CPSC's SREM Identification Blueprint

A Reproducible AI-Assisted Refinement of CPSC's Prevailing 2012 Magnet Set Injury Identification Strategy


1. SREM Identification Overview


In 2012, the Consumer Product Safety Commission (CPSC) embarked on a mission to prohibit the sale of Spherical Rare Earth Magnets (SREM), issuing a "Safety Standard for Magnet Sets" rulemaking notice that effectively banned 5mm neodymium magnet spheres such as Buckyballs and Zen Magnets for all ages. During this period, the CPSC's Directorate for Epidemiology staff reported a disturbing pattern: between 2009 and 2011, the National Electronic Injury Surveillance System (NEISS) registered 72 "in-scope" incidents likely related to the ingestion of SREM. This data extrapolated to a national injury estimate of around 1,716 cases for those three years. CPSC reasoned that this figure mirrored the potential societal benefits of the proposed rule, namely averting such incidents. In monetary terms, each Emergency Department incident translated to a societal cost of $14,500 according to CPSC in 2012, inclusive of ICM injury estimates (https://www.federalregister.gov/d/2012-21608/p-121).

This report examines the CPSC's 2012 methodology for SREM classification and provides a replicable, AI-augmented approach, using the GPT-4 Quorum method, that mimics the CPSC's prevailing behavior in selecting those 72 In-Scope SREM incidents. For comparison, the proposed method similarly estimates 1,716 cases from 2009-2011, although it differs from the CPSC in some specific selections. Whether the CPSC's SREM identification is reasonable or valid is discussed elsewhere and is not within the scope of this report. Here, the purpose is only to show how to emulate this identification method systematically, and to discuss how the CPSC's inconsistencies were considered in determining their prevailing behavior in selecting "In-Scope" SREM injuries.

Notably, these specific 72 incidents are the only known list of what the CPSC has considered "in-scope" in magnet set related rulemaking. The CPSC has not published any similar "in-scope" incident list for the November 2014 Final Rule (which was effective from 2015 to 2016), the January 2022 NPR, or the October 2022 Final Rule on magnet sets. Thus, it is impossible to vet the consistency, validity, or efficacy of the CPSC's "in-scope" selections for any of those regulatory documents. Each of these 72 incidents is also flagged in column AQ of the 'MAGN_Only' sheet in the MagnetSafety.org 2003-2022 NEISS Analysis.


2. Deciphering the CPSC's Criteria for SREM Identification

The NEISS database stands as a testament to the CPSC's commitment to transparency. However, the CPSC occasionally leaves analysts in the dark about the rationale behind its categorization of incidents for regulatory purposes. This makes the 72 SREM incidents identified during 2009-2012 a valuable resource for comprehension (https://www.federalregister.gov/d/2012-21608/p-119). The CPSC's description of these incidents alludes to "high-powered and/or ball-shaped magnet ingestions" that potentially pertain to the magnets in question. Still, the actual selection criteria employed by the CPSC are neither straightforward nor uniformly applied.


As seen in the spreadsheet, estimates for identifying SREM are more nuanced than just searching for "MAGN" or "BATT". Drawing insights from the CPSC's approach during 2009-2012, we deduce a two-step general strategy:


Detailed Implementation of the SREM Identification Strategy


Step 1: Prematching

An algorithmic solution similar to MAGN identification (See 2022 NEISS data overview) can facilitate this step, with spreadsheet software like Google Sheets or Excel serving as a useful tool. Within the MAGN incident data set, identify any incidents whose narrative contains any of the following terms: POWER, RARE, MARB, BALL, BB, BEARING, BEAD, SPHER, ROUND, PIERCE, TONGUE, TOUNG, LIP. Each term can be flagged in its own column for a visual overview, or a single combined formula can check all keywords at once. For example:


=IF(OR(ISNUMBER(SEARCH("POWER",W1)), ISNUMBER(SEARCH("RARE",W1)), ISNUMBER(SEARCH("MARB",W1)), ISNUMBER(SEARCH("BALL",W1)), ISNUMBER(SEARCH("BB",W1)), ISNUMBER(SEARCH("BEARING",W1)), ISNUMBER(SEARCH("BEAD",W1)), ISNUMBER(SEARCH("SPHER",W1)), ISNUMBER(SEARCH("ROUND",W1)), ISNUMBER(SEARCH("PIERCE",W1)), ISNUMBER(SEARCH("TONGUE",W1)), ISNUMBER(SEARCH("TOUNG",W1)), ISNUMBER(SEARCH("LIP",W1))), 1, 0)
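For readers working outside a spreadsheet, the same prematch can be sketched in Python. This is a minimal illustration, not part of the original workflow: the function names `prematch` and `matched_terms` are hypothetical, and the sketch assumes that a case-insensitive substring test is an adequate stand-in for the spreadsheet's SEARCH function on these keyword stems.

```python
# Keyword stems from Step 1. A narrative is a prematch if it contains any of them.
PREMATCH_TERMS = (
    "POWER", "RARE", "MARB", "BALL", "BB", "BEARING", "BEAD",
    "SPHER", "ROUND", "PIERCE", "TONGUE", "TOUNG", "LIP",
)


def matched_terms(narrative: str) -> list[str]:
    """Return the keyword stems found in the narrative (case-insensitive)."""
    text = narrative.upper()
    return [term for term in PREMATCH_TERMS if term in text]


def prematch(narrative: str) -> int:
    """Return 1 if any keyword stem occurs in the narrative, else 0,
    mirroring the 1/0 output of the combined spreadsheet formula."""
    return int(bool(matched_terms(narrative)))


if __name__ == "__main__":
    example = "13 YOF SWALLOWED MAGNETIC TONGUE PIERCING.DX: FB STOMACH."
    print(prematch(example), matched_terms(example))
```

Listing the matched stems alongside the 1/0 flag serves the same purpose as flagging each term in its own column: it shows why a given narrative was prematched.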


Step 2: Subjective Filtering

Injecting subjectivity into a classification method requires finesse, especially if the goal is to mimic the CPSC's approach while ensuring reproducibility. To this end, we harnessed the capabilities of AI, using a GPT-4 quorum. A detailed tutorial on how and when to implement a GPT-4 quorum is available (here). In summary, AI does an excellent job of discerning which narratives align with SREM criteria and which don't, based on a set of predefined rules and exceptions found to mimic the behavior of CPSC staff in 2012.

For each Prematched incident found in Step 1 (See Column AP, MAGN_Only sheet), the following Workspace GPT formula merges the instruction prompt with the narrative content, shown here for cell W24:


GPT Workspace Formula

=GPT(CONCATENATE("The instructions below normally require a brief output, but I would like you to explain your thinking, step by step

Task: Evaluate a NEISS incident narrative related to a potential ingestion of a rare earth spherical magnet set, such as buckyballs.


Response Format:

If the ingestion matches any exclusion criteria, respond with only a two-word reason. Do not provide any explanation or use additional words in your response.

Otherwise, if the ingestion doesn't match any exclusion criteria, respond with only the number: 1.


Exclusion Criteria (Reasons):

A. Metal Ball: If the ball ingested is metal or steel, often as a part of a magnetic product known to use metal balls (like geomag, magnetix for example). Consider that multiple objects may have been ingested, in which case the 'Metal Ball' exclusion criteria applies if the the ball ingested is not described as a magnetic ball. EXCEPTION: In case the ingested ball has a contradictory description like 'MAGNETIZED METAL BEAD' or 'METAL MAGNETIC BALL', do not use 'Metal Ball' exclusion criteria and output 1.

B. Non Magnetic: If the ingested ball is non magnetic. May refer to non-magnetic balls or a component of a decoration involving magnets but is not itself magnetized (e.g., a ball off an old magnet decoration or a part of a magnetic toy set not described as magnetic itself).

C. Shape Mismatch: If the magnet ingested specifically isn't spherical in shape, like a magnetic tube, cylinder, or disc.

D. Bracelet: The ingested item is explicitly part of a bracelet or similar wrist jewelry.

E. Not Ingested: If narrative doesn't indicate swallowing of a round and/or strong magnet. Magnets inserted in the nostrils are not considered ingestions, for example.

F. Uncertain Ingestion: If the narrative expresses uncertainty about the existence of an ingestions.


There might be other unique situations not covered by the above criteria, in which case come up with a creative 2 word reason for the exclusion. Remember, there might also be instances where multiple objects are ingested simultaneously. Your output should either a two word explanation like 'Metal Ball' if excluded, or '1' if not excluded.


Now, please take a deep breath and slowly evaluate the following NEISS narrative:


", $W24))


This prompt was found to mimic the prevailing behavior of the CPSC's selection of the 72 SREM incidents as closely as possible. Prompting GPT-4 to think "step by step", "explain your thinking", and "take a deep breath" dramatically increases accuracy, and leaves a trail of thought that can be retrospectively analyzed. An example output of this prompt, as seen in cell AT24, follows:


Step 1: Read the narrative: "INGESTION METAL BALL - 4 YOM INGESTED A SMALL METAL MAGNET BALL AT SCHOOL THIS MORNING, NO TROUBLE BREATHING."

Step 2: Identify the key information: The narrative states that a 4-year-old male ingested a small metal magnet ball at school. It also mentions that there is no trouble breathing.

Step 3: Apply the exclusion criteria:

A. Metal Ball: The narrative explicitly mentions that the ingested object is a "small metal magnet ball." Therefore, this exclusion criterion does not apply.

B. Non Magnetic: There is no information suggesting that the ingested ball is non-magnetic, so this exclusion criterion does not apply.

C. Shape Mismatch: The narrative describes the ingested object as a "small metal magnet ball," which indicates a spherical shape. Therefore, this exclusion criterion does not apply.

D. Bracelet: The narrative does not mention anything about the ingested item being part of a bracelet or similar wrist jewelry. Therefore, this exclusion criterion does not apply.

E. Not Ingested: The narrative explicitly states that the ball was ingested. Therefore, this exclusion criterion does not apply.

F. Uncertain Ingestion: The narrative does not express any uncertainty about the existence of an ingestion. Therefore, this exclusion criterion does not apply.

Step 4: Determine the output: None of the exclusion criteria apply, so the output should be the number 1 to indicate that the ingestion does not match any exclusion criteria.

Final output: 1


A second GPT-4 prompt can then extract only the final output for spreadsheet analysis:


=GPT(CONCATENATE("Return the final from an explanation. The output will either be the single digit '1' or a two word explanation like 'metal ball', and respond with nothing else beside the output. Here is the explanation to retrieve the output from:

", $AT24))


To mitigate any inconsistencies, each narrative was evaluated by three separate instances of GPT-4, with the majority verdict dictating the final classification. Though there are many ways to implement this GPT-4 automation, we specifically use the [GPT Quorum method explained here]. An example output of our GPT Quorum run can be found in column AR of the "MAGN_Only" sheet.

As with human reviewers, a high congruence rate among the GPT quorum doesn't automatically imply correctness or fairness in decisions. Nonetheless, uniform decisions suggest that the classification guidelines are stringent, providing precise and unambiguous criteria. The prompts provided were crafted to mirror the CPSC's dominant behavior observed in the 72 SREM incidents, as marked in the "CPSC" Column AQ of the "MAGN_Only" sheet. Out of the 783 Prematches reviewed, 606 decisions had complete agreement within the GPT Quorum. Utilizing the same prompts should yield a comparable 82% consensus ratio; however, the specific cases with dissent will differ.
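The majority verdict and the agreement check described above can be sketched as follows. This is a hedged illustration rather than the GPT Quorum implementation itself: the function names are hypothetical, the verdict strings are assumed to be already extracted as either '1' or a two-word reason, and returning `None` when no verdict reaches a strict majority is an assumption, since the text does not say how a three-way split is resolved.

```python
from collections import Counter


def normalize(verdict: str) -> str:
    """Lower-case and collapse whitespace so 'Metal Ball' equals 'metal ball'."""
    return " ".join(verdict.strip().lower().split())


def quorum_verdict(verdicts):
    """Return the verdict given by a strict majority of runs, else None.

    Assumption: a three-way split has no defined outcome in the source,
    so it is reported as None for manual review.
    """
    counts = Counter(normalize(v) for v in verdicts)
    verdict, votes = counts.most_common(1)[0]
    return verdict if votes * 2 > len(verdicts) else None


def is_unanimous(verdicts) -> bool:
    """True when every run returned the same (normalized) verdict."""
    return len({normalize(v) for v in verdicts}) == 1


def consensus_ratio(rows) -> float:
    """Share of narratives whose runs agreed completely."""
    rows = list(rows)
    return sum(is_unanimous(r) for r in rows) / len(rows)


if __name__ == "__main__":
    runs = [["1", "1", "1"], ["1", "1", "Metal Ball"]]
    print([quorum_verdict(r) for r in runs], consensus_ratio(runs))
```

Keeping the majority verdict separate from the unanimity check makes it possible to classify every narrative while still listing the dissenting cases for retrospective review.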

3. Dissecting the Data: Understanding the 72 Identified Incidents


The dataset labeled "SREM" - which can be seen within the MAGN sheet of the [2022 Neiss Data] - was compiled using the aforementioned method. This dataset also contains 72 incidents, yet they are not the same 72 incidents recognized by the CPSC in 2012. The variance arises from irregularities in the CPSC's selection behavior and from redactions in the public NEISS data. A more meticulous approach from CPSC's Epidemiology staff in 2012 would have pinpointed 75 incidents as SREMs, based on their demonstrated prevailing selection pattern. We've taken the liberty of identifying these inconsistent NEISS entries, providing a rationale for each.


Cases with Identification Issues due to NEISS Redaction:


Out of the 72 incidents acknowledged by the CPSC in 2012, three couldn't be confirmed as SREM from the public NEISS data because of redactions:


111219064 12/9/2011 6YOM TO ED,FATHER STATES PT.SWALLOWED 2 SM. *** 1.5 CM.,FATHER THINKS THEY WERE STUCK TOGETHER. DX; INGESTION F.B.

Rationale for inconsistency: Erroneous redaction. The redacted text is "RARE EARTH MAGNETS", which shouldn't have been omitted as no specific brand is indicated. A 1.5cm object suggests a marble-sized item, which might be large for an SREM but is not inconsistent with typical selection.


111250664 12/10/2011 10YOF HX OF PICA, INGESTED 10- 15 SMALL MAGNETIC "***" NO SXDX FB INGESTION

Rationale for inconsistency: Redacted brand, making identification not possible for the public.


111250790 12/22/2011 11 YOF SWALLOWED 3 *** MAGNETSDX: INGESTED 3 *** MAGNETS

Rationale for inconsistency: Redacted brand, making identification not possible for the public.


Cases Incorrectly Identified as SREM by CPSC:


Out of the 72 incidents acknowledged by the CPSC in 2012, two should not have been identified as SREM based on prevailing selection patterns:


110105487 12/14/2010 23MOM SWALLOWED A BRACELET WITH MAGNETIC BEADS, INGESTION OFA FOREIGN BODY, TRANSFERRED

Rationale for inconsistency: Conflicts with 110740743. While the CPSC marked this incident as SREM, other instances with magnetic bead bracelets were not. Furthermore, the 2012 Rulemaking explicitly doesn't apply to magnetic jewelry.


110530920 3/12/2011 7YOM PUT 2 SMALL ROUND MAGNETS UP EACH NOSTRIL, MAGNETS ARE STUCK TOGETHER AT SEPTUM; FB NOSE

Rationale for inconsistency: Conflicts with 111049138. Here, the magnets weren't ingested, making the CPSC's SREM identification inconsistent.


110506569 4/23/2011 6YOF PUT 2 MAGNETIC BEADS INTO RT NOSTRIL, 1ST X-RAY SHOWED THEM ONFLOOR OF RT MAXILLARY SINUS, REPEAT X-RAY DID NOT. DX - INGESTED F.B.

Rationale for inconsistency: Conflicts with 111049138. Here, the magnets weren't ingested, making the CPSC's SREM identification inconsistent.


Overlooked Cases that Should have been Identified as SREM:


Out of all Magnet Ingestion and Aspiration data from 2009-2011, five incidents should have been identified as SREM, based on prevailing selection patterns:


91229360 12/11/2009 ACC SWALLOWED NON MAGNATIC MARBLE VS ROUND MAGNET, NO DROOLING/SOB>>FB INGESTION

Rationale for inconsistency: This incident wasn't identified by the CPSC as SREM, but it should have been based on the round magnet description.


100307204 2/26/2010 9YOF SWALLOWED EITHER A SMALL ROUND MAGNET OR A BALL BEARING FB INGESTION

Rationale for inconsistency: The CPSC didn't classify this as SREM, yet the potential round magnet and ball bearing descriptions suggest they should have.


100342646 3/10/2010 10 MO M INGESTED SEVERAL *** MAGNETIC BEAD TOY;DX INGESTED OBJECTS

Rationale for inconsistency: Redaction is present. The magnetic bead is identifiable, but potential exclusionary details are obscured.


100451620 4/20/2010 11YOF INGESTED 2 ROUND BALL MAGNETS, FROM ARTIFICIAL TONGUE RING, ATSCHOOL. DX; MAGNET INGESTION X 2

Rationale for inconsistency: Conflicts with 90122469. The CPSC did not identify this incident as SREM, but given the use of ball magnets in a tongue ring, it should have.


111207831 11/16/2011 13 YOF SWALLOWED MAGNETIC TONGUE PIERCING.DX: FB STOMACH.

Rationale for inconsistency: Conflicts with 90122469. This entry should have been identified as SREM by the CPSC due to the piercing.
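The counts in this section can be reconciled with a few lines of arithmetic. The sketch below is one reading of the numbers, using the category tallies as stated in the text (three redacted, two incorrectly identified, five overlooked); the variable names are illustrative only.

```python
# Tallies as stated in this section (2009-2011 incidents).
CPSC_2012_IN_SCOPE = 72        # incidents the CPSC listed as in-scope
REDACTED_IN_PUBLIC_DATA = 3    # in-scope, but not confirmable from public NEISS
INCORRECTLY_IDENTIFIED = 2     # listed by the CPSC against its own pattern
OVERLOOKED = 5                 # fit the pattern but were not listed

# What a consistent application of the prevailing pattern would give the
# CPSC, which has access to the unredacted narratives.
consistent_cpsc_total = CPSC_2012_IN_SCOPE - INCORRECTLY_IDENTIFIED + OVERLOOKED

# What the same pattern yields on public data, where the three redacted
# narratives cannot be confirmed as SREM.
public_blueprint_total = consistent_cpsc_total - REDACTED_IN_PUBLIC_DATA

print(consistent_cpsc_total, public_blueprint_total)
```

Under this reading, the blueprint arrives at the same total of 72 as the CPSC while differing in which incidents are selected.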


Our capacity to deconstruct and scrutinize the CPSC's SREM classification relied on having detailed knowledge of the 72 incidents from the 2012 Magnet Set rulemaking. Such detailed data isn't customarily available in CPSC regulations. This accentuates the need for comprehensive transparency in incident categorization to uphold data veracity and reliability.

4. Conclusion


This examination presents a comprehensive analysis of the Consumer Product Safety Commission's (CPSC) methodology for the identification of incidents related to Spherical Rare Earth Magnets (SREM), shedding light on its apparent inconsistencies and procedural ambiguities. It further proposes an AI-augmented framework for refining the existing system to enable more standardized and replicable results. The objective has not been to scrutinize the accuracy of the CPSC’s criteria for SREM identification but to present a repeatable SREM identification method based on the strategy revealed by their 2009-2011 selections.

The introduction of an AI-augmented approach stands as a promising solution to mitigate subjectivity and enhance reproducibility in injury estimation methods. The efficacy of our GPT-4-based process in categorizing incidents demonstrates the potential role of artificial intelligence in refining, standardizing, and possibly automating regulatory processes. 

It is critical to emphasize that our presentation of the CPSC's blueprint is precisely that—a blueprint—and while it offers a detailed breakdown of the CPSC's approach, it does not make any assertions regarding the fidelity of their method in accurately identifying SREMs. Questions of efficacy, accuracy and validity of CPSC's SREM identification method are not within the scope of this study, but are addressed elsewhere:  [2022 vs 2012 Scope Comparison URL] and [SREM Dataset Limitations].

The implementation of the SREM identification blueprint on public NEISS datasets indicates an observable discrepancy from the unredacted NEISS data that is available to the CPSC. This underscores that the lack of full transparency and uniformity in the CPSC’s identification process poses challenges to data integrity and hampers the public's capacity to independently verify or contest CPSC estimates. This is compounded by inconsistencies revealed through our analysis, ranging from issues of unnecessary (non-brand-name) redaction in the National Electronic Injury Surveillance System (NEISS) data to irregularities in classification and selection behavior. While the CPSC identified 72 incidents, a more discerning application of their stated criteria could have, theoretically, identified 75 incidents. This differential, although seemingly minimal, carries significant weight when translated into monetary terms and safety considerations.

In the future, we hope that the CPSC not only lists the specific In-Scope NEISS incidents - those considered for the societal benefits of all regulatory analysis - but also that CPSC themselves use publicly available AI technology to perform subjective incident selection with full prompting parameters revealed, so as to avoid any hidden CPSC staff selection bias.

The CPSC's goals in the realm of injury estimation undoubtedly have merit, but as we journey deeper into an age underscored by the role of data, it becomes imperative to bolster these methodologies with the precision, objectivity, and consistency that technology can provide. As we strive to safeguard our societies from potential hazards, it's pivotal that the strategies we employ are not just comprehensive but also transparent, replicable, and resistant to ambiguity.


Comments or suggestions on improvements to future iterations of the CPSC's SREM blueprint are welcome. Email: outreach (@) magnetsafety.org

