A New Drug-Shelf Arrangement for Reducing Medication Errors Using Data Mining: A Case Study

Medication errors are common, costly, and sometimes fatal, but they are preventable. The location of drugs on the shelves and wrong drug names in prescriptions can cause errors during the dispensing process. Therefore, a good drug-shelf arrangement system in pharmacies is crucial for preventing medication errors, increasing patient safety, evaluating pharmacy performance, and improving patient outcomes. The main purpose of this study is to suggest a new drug-shelf arrangement for the pharmacy that prevents the pharmacist from selecting the wrong drug from the shelves. The study proposes an integrated structure with a three-stage data mining method using patient prescription records in a database. In the first stage, drugs on prescriptions were grouped according to the Anatomical Therapeutic Chemical (ATC) classification system to determine associations of drug utilization. In the second stage, association rule mining (ARM), a well-known data mining technique, was applied to obtain frequent association rules between drugs that tend to be purchased together. In the third stage, the rules generated by ARM were used in multidimensional scaling (MDS) analysis to create a map displaying the relative location of drug groups on pharmacy shelves. The results of the study showed that data mining is a valuable and efficient tool that provides a basis for future investigation to enhance patient safety.


Introduction
The number and variety of available medications have increased with the appearance of different types and levels of illness. Drug types range from common, low-priced ones, such as antibiotics, to specific and costly ones, such as cancer drugs. Unfortunately, this increase in the diversity of pharmaceutical products has also increased the errors associated with medications [1]. According to the National Coordinating Council for Medication Error Reporting and Prevention (NCC MERP), "medication error is an incident that can be prevented which may result in or drive to improper medication use or patient harm" [2]. A practical guide by the World Health Organization (WHO) reported that medication errors can cause therapeutic failure and side effects due to adverse drug reactions, and can also waste time and available resources [3]. According to a report by the U.S. Food and Drug Administration (FDA), medication errors have several causes, such as poor handwriting, similarly named drugs, packaging design defects, mistakes in dosing units, and a non-conducive work environment or staffing issues [4].
In particular, environmental conditions in the pharmacy, such as the location of drugs on the shelves and the drug names in prescriptions, can increase the probability of error [5,6]. The Joint Commission on Accreditation of Healthcare Organizations (JCAHO) and the Institute for Safe Medication Practices (ISMP) have suggested that medications with similar names be separated on pharmacy shelves and not be stored side by side or alphabetically [7,8], because many drug names sound similar (sound-alike) or appear similar (look-alike). Owing to these similarities when written or spoken, such drugs have sometimes been mixed up with each other [9].
For example, Celebrex® (generic name: celecoxib), Cerebyx® (fosphenytoin), and Celexa® (citalopram) are drugs with similar commercial names, and they can be mismatched while a prescription is being prepared. This error can result in serious adverse effects such as decline in mental status and seizure [10]. Taxol® (paclitaxel) and Taxotere® (docetaxel) are another example. Confusing these drugs can have fatal results because they are used to treat various types of cancer [11]. It has been estimated that confusion of drug names is responsible for 10000 patient injuries each year in the US [12]. Previous studies have reported medication error rates ranging from 0.84% to 2.9% [13][14][15][16]. In summary, a mismatch of drugs while the pharmacist is picking them can result in disease progression and complications, a decrease in functional abilities and quality of life, and even death [17,18].

Despite the crucial importance of drug arrangement for patients' health, there is a lack of studies addressing this topic. Therefore, this paper aims to provide a new drug-shelf arrangement system for pharmacies, based on a three-stage Data Mining (DM) method, to reduce medication errors during the dispensing process. DM aims to extract new, valuable, and potentially useful information from large databases [19,20]. DM techniques can be used extensively in hospitals, clinics, and pharmacies by healthcare providers to give better and more affordable healthcare services to patients [21][22][23][24][25][26][27].

Material and Method
The empirical study was conducted at a public pharmacy in İstanbul, Turkey. The pharmacy has a rectangular area of about 75 m². On the shelves, medicines are arranged in alphabetical order. A team of pharmacists and pharmacy technicians who worked there were invited to contribute to this study. They recorded the medication errors over a 3-month period (health care products that can be sold without prescription were excluded from the study). During this period, the largest share of medication errors was caused by wrong selection from the shelves of drugs with similar-looking names (about 39% of 102 errors). Therefore, a four-year (01/2010-01/2014) data set including 16657 patient prescriptions was examined to make a comprehensive analysis. Covering several full years keeps seasonal or monthly sales effects from distorting the analysis.
The study proposes an integrated structure that consists of three main stages: Anatomical Therapeutic Chemical (ATC) drug classification, data mining, and multi-dimensional scaling. In the first stage, prescriptions were processed according to the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. The CRISP-DM methodology structures the data mining process so that large data mining projects become less costly, more reliable, and faster [28]. Then, the drugs were classified into groups according to the ATC drug classification system.
In the second stage, association rule mining (ARM), which is a widely used data mining technique, was performed to discover associations between the drugs using patient prescriptions. ARM is one of the common methods used to reveal hidden relationships among attributes [29][30][31][32][33]. The analysis was conducted with a common ARM algorithm, the Apriori algorithm, to explore associated patterns in prescriptions using the SPSS Clementine 12.0 data mining tool [34]. The Apriori algorithm is given as follows:

L_1 = {large 1-itemsets}; for (k = 2; L_{k-1} ≠ ∅; k++) do C_k = apriori-gen(L_{k-1}); ...

In the third stage, the rules generated by ARM were supported with a visual map using the multidimensional scaling (MDS) technique. A two-dimensional map of drugs was obtained by discovering relationships between 1st level ATC drug groups. The analysis was conducted with SPSS Statistics version 21.0 software. The general flow chart of the study is summarized in Figure 1. Detailed information for each stage is given in the following sub-sections.
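For the second stage, the level-wise search that Apriori performs can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS Clementine implementation used in the study; the function name `apriori`, the toy transactions, and the 40% support threshold are assumptions made only for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Hypothetical helper: a plain level-wise Apriori search, not the
    SPSS Clementine implementation used in the study.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def keep_frequent(candidates):
        # Count each candidate and keep those that meet the threshold.
        kept = {}
        for cand in candidates:
            s = sum(1 for b in baskets if cand <= b) / n
            if s >= min_support:
                kept[cand] = s
        return kept

    # L1: frequent 1-itemsets.
    level = keep_frequent({frozenset([item]) for b in baskets for item in b})
    frequent = dict(level)
    k = 2
    while level:  # continue while L(k-1) is not empty
        prev = list(level)
        # Join step: unite pairs of (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = keep_frequent(candidates)
        frequent.update(level)
        k += 1
    return frequent


# Toy prescriptions expressed as sets of 1st-level ATC letters (made up).
toy = [{"R", "J"}, {"R", "J"}, {"R"}, {"M", "A"}, {"A"}]
for itemset, s in sorted(apriori(toy, 0.4).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), round(s, 2))
```

The prune step is what distinguishes Apriori from brute-force counting: a candidate is only counted if all of its smaller subsets were already frequent.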

Anatomical Therapeutic Chemical (ATC) classification
The Anatomical Therapeutic Chemical (ATC) classification system was developed to support drug utilization research. The ATC classification system enables access to drug consumption statistics at an international level. The ATC classification is hierarchical and includes five levels, from the most general (first level) to the most specific (fifth level). A complete ATC code carries detailed information about a drug's chemical and therapeutic characteristics and is thus too specific for classification based on only hundreds of examples. Therefore, we concentrated on the 1st level of the ATC classification code, which shows the anatomical main group and is represented by one capital letter, as shown in Table 1 [35,36].
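Because the 1st level is simply the first letter of the code, reducing a full ATC code to its anatomical main group is a one-step lookup. The sketch below is illustrative only: the helper name `atc_first_level` is hypothetical, the group names follow those used in this paper, and the sample code `J01CA04` is just an example input.

```python
# First-level ATC groups as named in this paper (one capital letter each).
ATC_LEVEL1 = {
    "A": "Alimentary tract and metabolism",
    "B": "Blood and blood forming organs",
    "C": "Cardiovascular system",
    "D": "Dermatologicals",
    "G": "Genito-urinary system and sex hormones",
    "H": "Systemic hormonal preparations",
    "J": "Antiinfectives for systemic use",
    "L": "Antineoplastic and immunomodulating agents",
    "M": "Musculo-skeletal system",
    "N": "Nervous system",
    "P": "Antiparasitic products, insecticides and repellents",
    "R": "Respiratory system",
    "S": "Sensory organs",
    "V": "Various",
}


def atc_first_level(code: str) -> str:
    """Return the 1st-level (anatomical main group) letter of an ATC code."""
    letter = code.strip().upper()[:1]
    if letter not in ATC_LEVEL1:
        raise ValueError(f"unknown ATC anatomical group: {code!r}")
    return letter


group = atc_first_level("J01CA04")  # example input code
print(group, "-", ATC_LEVEL1[group])
```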

Data mining
Daily life problems in engineering, economics, and the social and medical sciences generate huge amounts of data. Mathematical methods must be applied to extract valuable information from these data sets, which is critical in decision-making problems. In previous studies, different mathematical tools, such as machine learning, data mining, data analysis, and soft set and fuzzy soft set theory, have been applied successfully [37][38][39][40][41][42][43].
Data mining includes many different techniques for obtaining various types of information. ARM is a well-recognized data mining technique for discovering interesting correlations, associations, or causal structures among sets of items in transaction databases. An association rule is expressed in the form X⇒Y, where X and Y are disjoint item sets, i.e., X∩Y=∅. X and Y are called the antecedent and the consequent of the rule, respectively. The quality of an association rule can be described in terms of its support, confidence, and lift values [19].
• Support of a rule is the percentage of records in the database that contain X∪Y. The support of a rule X⇒Y can be expressed as follows:

$$ s(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{N} $$

• Confidence of a rule is the ratio of the number of transactions that include X∪Y to the number of records that include X. The confidence of the rule X⇒Y is expressed as below:

$$ c(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)} $$

where s and c are the support and confidence values, respectively, σ(·) is the number of transactions that contain the given item set, and N is the total number of transactions.
• The lift (also called interest) of a rule is used to discover interesting patterns and measures how much Y depends on X. A lift greater than 1 means that item X and item Y tend to occur together more often than would be expected by random chance. Similarly, a lift smaller than 1 shows that item X and item Y are bought together less often than would be expected by random chance [44].
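The text describes lift in words only. A common definition, stated here as an assumption that is consistent with that description, divides the confidence of the rule by the support of the consequent. Writing σ(·) for the number of transactions that contain an item set and N for the total number of transactions:

$$ \mathrm{lift}(X \Rightarrow Y) = \frac{c(X \Rightarrow Y)}{s(Y)} = \frac{N \, \sigma(X \cup Y)}{\sigma(X) \, \sigma(Y)}, \qquad s(Y) = \frac{\sigma(Y)}{N} $$

As a purely hypothetical worked example, take N = 100, σ(X) = 40, σ(Y) = 20, and σ(X∪Y) = 16:

$$ s = \frac{16}{100} = 0.16, \qquad c = \frac{16}{40} = 0.40, \qquad \mathrm{lift} = \frac{0.40}{0.20} = 2.0 $$

Under these made-up counts, Y appears twice as often among transactions containing X as it does in the database overall.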

Multi-dimensional scaling
Multi-dimensional scaling (MDS) is a family of statistical techniques for representing the structure of (dis)similarity data as distances in a low-dimensional space, so that the data become accessible to visual inspection. MDS creates a map displaying the relative positions of a number of items.
In other words, points that are closer together on the spatial map represent similar objects, while those that are further apart represent dissimilar ones [45][46][47].
The accuracy of an MDS map can be measured with the stress value. Stress measures the mismatch between the input proximities and the output distances in the MDS map. Stress values lie between zero and one. Kruskal's stress function is used to assess a model's success and is expressed with the following equation [48]:

$$ \text{Stress} = \sqrt{\frac{\sum_{i<j} \left(\delta_{ij} - d_{ij}\right)^2}{\sum_{i<j} d_{ij}^2}} $$

where δij is the value of the proximity between items i and j, and dij is the spatial distance between them on the map.
An MDS map fits the input data perfectly if the stress value is zero; the smaller the stress value, the better the model agrees with the input data. Although there is no strict rule on how much stress is tolerable, the rule used here was that a value ≤ 0.1 is excellent and anything ≥ 0.15 is unacceptable [49].
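To make the stress criterion concrete, the sketch below computes a simplified Kruskal-type stress directly from raw proximities and map distances. It is an assumption-laden illustration: Kruskal's original formulation compares distances with monotonically transformed proximities (disparities), whereas this version uses the proximities as they are; the function name `kruskal_stress` and the toy matrices are hypothetical.

```python
import math


def kruskal_stress(delta, d):
    """Simplified stress between proximities `delta` and map distances `d`.

    Both arguments are symmetric square matrices (lists of lists).
    Only pairs with i < j are used, so each pair is counted once.
    """
    n = len(delta)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    numerator = sum((delta[i][j] - d[i][j]) ** 2 for i, j in pairs)
    denominator = sum(d[i][j] ** 2 for i, j in pairs)
    return math.sqrt(numerator / denominator)


# Toy example (made-up numbers): one pair is misrepresented on the map.
proximities = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
map_distances = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(round(kruskal_stress(proximities, map_distances), 3))
print(kruskal_stress(proximities, proximities))  # perfect fit gives zero stress
```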

Drug classification
In the pharmacy database, the prescription records of patients are collected and stored. The prescriptions are transferred from the Social Security Institution (SSI) system or are entered manually. The raw data set was transferred from the pharmacy database to MS Excel. First, records with missing prescription information were identified and removed from the data set. Then, drugs in prescriptions were classified according to the ATC classification system using information from the Turkish Medicines and Medical Device Agency [50]. The distribution of drugs by 1st level ATC groups is shown in Figure 2. In order to determine which drug groups are sold together to the same patient, a pivot table was also constructed, giving a list of prescription transactions. It has two dimensions, where the columns represent ATC drug groups and the rows represent patients. These transactions were then converted to tabular data format, in which each record represents a separate transaction with as many True/False flag fields as there are drug groups, because the Apriori model can be built from data in tabular format.
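The conversion to tabular True/False format can be sketched as follows. This is a minimal illustration under stated assumptions: the input is taken to be a list of (record id, ATC code) pairs, the helper name `to_flag_table` is hypothetical, and the sample rows are invented rather than taken from the study data.

```python
from collections import defaultdict

# The 14 first-level ATC letters, one flag column per group.
GROUPS = "ABCDGHJLMNPRSV"


def to_flag_table(rows):
    """Turn (record_id, atc_code) pairs into {record_id: {group: bool}}."""
    seen = defaultdict(set)
    for record_id, atc_code in rows:
        # Keep only the anatomical main group (first letter of the code).
        seen[record_id].add(atc_code.strip().upper()[0])
    return {
        record_id: {group: (group in letters) for group in GROUPS}
        for record_id, letters in seen.items()
    }


# Invented example rows: record 1 holds two drugs, record 2 holds one.
rows = [(1, "J01CA04"), (1, "R05CB01"), (2, "A02BC01")]
table = to_flag_table(rows)
for record_id, flags in table.items():
    print(record_id, [group for group, present in flags.items() if present])
```

Each record ends up with exactly one flag per drug group, which is the shape an Apriori implementation over flag fields expects.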

Generation of association rules
The data mining software SPSS Clementine 12.0 was used to generate association rules between drug groups. SPSS Clementine works with icons, which represent operations and are often referred to as nodes. The nodes are linked together on a stream canvas to generate models, graphs, and other outputs [51].
The Apriori algorithm is based on three main parameters that are pre-determined by the user: min antecedent support, min rule confidence, and max number of antecedents. In order to obtain more rules, these parameters were set to 0.5%, 20%, and 1, respectively. These values were the same as those reported in a previous study carried out in Turkey [46]. From the resulting rules, 45 meaningful rules were selected according to their lift ratios (Lift > 1), because a larger lift means a more interesting rule. Table 2 shows a sample of the generated rules, which gives details on the relationships between drug groups. For instance, the knowledge that patients who purchase a "respiratory system" drug also tend to buy an "antiinfectives for systemic use" drug at the same time can be represented as an association rule as follows:

Respiratory system ⇒ Anti-infectives for systemic use; [Rule support = 27%, Confidence = 65.7%, Lift = 3.6]

• Support of 27% means that in 27% of all the prescriptions, respiratory system (R) and antiinfectives for systemic use (J) drugs of the ATC classification system were prescribed together.
• Confidence of 65.7% means that 65.7% of the patients who purchased drugs from the "R" group also bought drugs from the "J" group of the ATC classification system.
• On the other hand, the lift of 3.6 means that patients who purchase drugs from the "R" group are 3.6 times more likely to also purchase drugs from the "J" group of the ATC classification system than randomly chosen patients.
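The rule selection described above (a single antecedent, thresholds on antecedent support and confidence, then a lift filter) can be sketched as below. This is an illustrative reimplementation, not SPSS Clementine's Apriori node; the function name `mine_rules` and the toy transactions are assumptions, while the default thresholds mirror the 0.5% and 20% settings and the Lift > 1 filter reported in the text.

```python
from collections import Counter
from itertools import permutations


def mine_rules(transactions, min_antecedent_support=0.005, min_confidence=0.20):
    """Return single-antecedent rules as (X, Y, rule_support, confidence, lift).

    Only rules with lift > 1 are kept, sorted by decreasing lift.
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for t in transactions:
        items = set(t)
        item_count.update(items)
        # Ordered pairs, because X => Y and Y => X are different rules.
        pair_count.update(permutations(items, 2))

    rules = []
    for (x, y), n_xy in pair_count.items():
        antecedent_support = item_count[x] / n
        confidence = n_xy / item_count[x]
        lift = confidence / (item_count[y] / n)
        if (antecedent_support >= min_antecedent_support
                and confidence >= min_confidence
                and lift > 1):
            rules.append((x, y, n_xy / n, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Toy prescriptions as sets of 1st-level ATC letters (made up).
toy = [{"R", "J"}, {"R", "J"}, {"R"}, {"M", "A"}, {"A"}]
for x, y, s, c, l in mine_rules(toy):
    print(f"{x} => {y}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

Restricting rules to one antecedent keeps the output directly usable as a group-by-group matrix, which is the form the later scaling step needs.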
In order to show the associations between drug groups, a web graph was also produced. The web graph consists of lines whose appearance shows the strength of the connections between the drug groups.

Visualization of drug groups
The multi-dimensional scaling (MDS) technique was used to create a map displaying the relative positions of drug groups on shelves, given a table of distances between them. For this analysis, the confidence values obtained by the Apriori algorithm were used as distances. First, a 14x14 (ATC x ATC) input data matrix was constructed. MDS algorithms use a Euclidean dissimilarity (distance) matrix.
A Euclidean dissimilarity matrix was therefore created from the input data matrix so that the confidence values could be used as distances (each value was inverted by subtracting it from 1). For example, two items with a strong association will have a small dissimilarity (1 - 0.90 = 0.1), which places them closer together on the map [52]. In this way, the relationships among all drug groups are represented spatially in two-dimensional space using MDS. The model stress was calculated as 0.091, which shows that the model is acceptable. As seen in Figure 4, the 1st level ATC drug groups were divided into four main clusters according to the strength of the association rules.
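The transformation from confidence values to map coordinates can be sketched in dependency-free Python. Several points here are assumptions rather than the study's procedure: the study used SPSS Statistics, whereas this sketch uses classical (Torgerson) MDS with power iteration; averaging the two directed confidences to obtain a symmetric matrix is our choice, since the text does not say how asymmetry was handled; and the 4x4 confidence matrix is invented.

```python
import math
import random


def confidence_to_dissimilarity(conf):
    """Dissimilarity = 1 - confidence, symmetrized by averaging (assumption)."""
    n = len(conf)
    return [[0.0 if i == j else 1.0 - 0.5 * (conf[i][j] + conf[j][i])
             for j in range(n)] for i in range(n)]


def double_center(d):
    """Return B = -1/2 * J * D^2 * J for a symmetric dissimilarity matrix."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]


def top_eigenpair(b, iters=1000, seed=0):
    """Power iteration: eigenpair of largest magnitude of a symmetric matrix."""
    rng = random.Random(seed)
    n = len(b)
    v = [rng.random() - 0.5 for _ in range(n)]
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def classical_mds(d, dims=2):
    """Classical MDS coordinates; negative eigenvalues are clipped to zero."""
    b = double_center(d)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        lam, v = top_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Invented confidences between four groups (rows: antecedent, cols: consequent).
labels = ["R", "J", "M", "A"]
toy_conf = [[0.00, 0.66, 0.10, 0.20],
            [0.60, 0.00, 0.10, 0.15],
            [0.10, 0.10, 0.00, 0.40],
            [0.12, 0.08, 0.36, 0.00]]
for label, (x, y) in zip(labels, classical_mds(confidence_to_dissimilarity(toy_conf))):
    print(label, round(x, 3), round(y, 3))
```

Power iteration returns the eigenvalue of largest magnitude, so strongly non-Euclidean input may need a proper eigensolver; for a small, near-Euclidean matrix such as a 14x14 group table the sketch is usually adequate.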

Discussion
In this study, 16657 prescriptions, which were written by 1478 physicians practicing in various health care facilities, were analyzed. The average number of medicines per prescription (NMPP), a prescribing indicator that is relevant to the rational use of medicines (RUM), was about 2.84. The NMPP value in this study was similar to that reported in previous studies conducted in Turkey [53]. The rules generated by ARM were used in MDS analysis to cluster the drug groups on the two-dimensional map. All drug clusters are represented in the multidimensional space as shown in Figure 4. Four different clusters were formed according to the strength of associations between 1st level ATC drug groups. The first cluster consists of antiinfectives for systemic use (ATC code: J) and respiratory system (ATC code: R) drugs [R ⇒ J; Support = 27%, Confidence = 65.7%], while the second cluster includes the musculo-skeletal system (ATC code: M), alimentary tract and metabolism (ATC code: A), and nervous system (ATC code: N) drug groups [M ⇒ A; Support = 14.8%, Confidence = 39.8%]; [N ⇒ A; Support = 10.2%, Confidence = 36.4%].
The drugs for the cardiovascular system (ATC code: C) and for blood and blood forming organs (ATC code: B) were placed in Cluster 3 [C ⇒ B; Support = 4.1%, Confidence = 43%]. The last cluster, Cluster 4, includes drug groups with low associations and sales numbers, such as dermatologicals (ATC code: D), sensory organs (ATC code: S), systemic hormonal preparations (ATC code: H), genito-urinary system and sex hormones (ATC code: G), anti-parasitic products, insecticides and repellents (ATC code: P), antineoplastic and immunomodulating agents (ATC code: L), and various (ATC code: V).
The results of this study indicated that the model based on ARM can help prevent medication errors by clustering drugs and then arranging them according to their associations. This strategy is important because it addresses wrong selection of drugs with similar-looking names in a systematic way. The model-based drug-shelf arrangement in a pharmacy also helps to ensure patient safety and the overall quality of care.

Conclusion
Preventing medication errors is essential for patient safety. However, the current drug arrangement systems in pharmacies can contribute to medication errors. This paper provides a new drug-shelf arrangement system based on a three-stage method that considers drug associations in prescriptions. With it, pharmacists can identify, separate, and store drugs on shelves according to their associations and sales trends. The proposed method is also suitable for generating drug arrangements for all pharmacies, as the ATC classification system is universal. Finally, this paper is the first study in the literature to integrate ATC classification and association rule mining to solve drug-shelf arrangement problems, and it provides a basis for future investigation to enhance patient safety.

Figure 1. The flowchart of the study

Figure 2. Distribution of drugs according to first level of the ATC classification system

Figure 3 illustrates all links between drug groups in terms of their associations. The darker lines indicate strong associations, while the dashed lines indicate weak associations between two drug groups.

Figure 3. Web graph of drug groups

Figure 4. Two dimensional map of drug groups

Table 1. First level ATC classification

Table 2. Sample of association rules with their support and confidence values