Definition and Description of Analytics
Data and information drive quality and efficient performance in many organizations in the modern age of information technology. No sector, healthcare included, can operate at the highest level of effectiveness and efficiency without leveraging data and data processing tools to acquire useful information (Coiera, 2015). Modern reforms in healthcare are geared towards improving the care and services offered to patients. Strome (2013) posited that organizations operating within the health care system claim to be involved in quality improvement (QI) efforts. However, there is still inadequate evidence that these organizations are improving their quality in a sustainable manner. Efforts to increase spending on health care do not guarantee quality service, as the case of the United States reveals.
Coiera (2015) elucidated the need to direct the assets of healthcare organizations to the investments that are best placed to improve the quality of care provided. Strome (2013) supported investment in health information systems as the primary indicator of quality service provision in these organizations. The author added that relevant information has the potential to change organizations and propel them in the right direction. Information systems provide the means for implementing evidence-based practice, applying intelligent algorithms, and gathering intelligence about pertinent activities relating to patient care. Lorenzi & Riley (2013) conjectured that the operations of healthcare organizations are becoming data driven, with practitioners having to deal with vast amounts of data on a daily basis. As the volume of data flowing through these organizations grows every day, it is critical to apply powerful tools that analyze and report the data in a manner that is easy to use, relevant, and reliable.
Health informatics is the field of study concerned with the use of sophisticated tools for the analysis, presentation, and application of data in practice for different functions, including decision-making and quality and performance improvement, among other areas (Lorenzi & Riley, 2013). In defining the concept of health informatics, it is critical to understand both what it is and what it is not. The concept covers the acquisition, storage, retrieval, and use of health information with the aim of improving the care provided and the performance of care providers. On the other hand, health informatics is not simply information technology used in health care. While IT entails the use of technology in the provision of care, it is not synonymous with health informatics. Rather, health informatics is the "science" of the why and how behind health information technology.
Coiera (2015) hypothesized that health informatics is a transforming field connecting IT, communication, and the systems for quality improvement and patient safety in care provision. The field entails the application of theories and concepts to practice in order to accomplish better outcomes in care provision. It includes the application of health information technology (HIT) to improve the care provided to patients through a blend of better efficiency, enhanced quality, and innovative opportunities for addressing the multifaceted needs of patients. Health informatics comes into play in designing, developing, adopting, and using IT-based innovations in the planning, management, and delivery of care. It makes use of various tools and processes to achieve this goal.
Tools and Techniques of Analytics
Software Analytical Tools
SAS. Statistical Analysis System (SAS) is a suite of software created by SAS Institute that is best suited for advanced data analysis. Healthcare organizations use it as a business intelligence tool for accessing, managing, analyzing, and presenting data. Given its advanced capabilities, it is used for business intelligence, multivariate analysis, predictive analytics, and data management (Katal, Wazid, & Goudar, 2013). The tool can mine, transform, and retrieve health data from different sources in order to carry out statistical analysis. An analysis is written as a series of statements that the software executes to produce output that is easy to understand. The Output Delivery System publishes the analyzed data in Excel, PDF, HTML, and other formats.
SPSS. The Statistical Package for the Social Sciences (SPSS) is used to analyze huge data sets that cannot be analyzed effectively with basic statistical packages and spreadsheets. The software is not limited to the social sciences, as it has entered other fields, including the health sciences. It is used in analyses that require establishing new relationships in the data as well as predicting subsequent actions. The package helps to eliminate time-consuming data preparation, so that insights can be created, manipulated, and distributed quickly (Katal, Wazid, & Goudar, 2013). The tool supports the analysis of both batched and non-batched data and is useful for survey authoring and implementation, text analysis, data mining, and collaboration.
SQL. The acronym stands for Structured Query Language. It is a tool used to access and manipulate data held in relational databases, as opposed to standalone data. This domain-specific language is essentially a tool for programming and manipulating a relational database management system (RDBMS). The tool manipulates databases by executing queries, retrieving data, inserting, updating, and deleting records, creating new databases and tables, creating stored procedures and views, and setting permissions on the use of the databases. Given the vast amount of health data organized in databases, the tool is useful for managing and manipulating that data in a way that supports effective and efficient operations (Katal, Wazid, & Goudar, 2013). The tool makes it easier to manage large amounts of data, which is the reality of modern health care settings.
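The operations listed above can be sketched with a small in-memory database. The example below is only an illustration that uses Python's built-in sqlite3 module; the table name, column names, and patient records are hypothetical and do not come from any real system.

```python
import sqlite3

# In-memory database; all names and records below are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Form a new table.
cur.execute(
    "CREATE TABLE patients ("
    "id INTEGER PRIMARY KEY, name TEXT, ward TEXT, age INTEGER)"
)

# Insert records.
cur.executemany(
    "INSERT INTO patients (name, ward, age) VALUES (?, ?, ?)",
    [
        ("Patient A", "Cardiology", 64),
        ("Patient B", "Oncology", 51),
        ("Patient C", "Cardiology", 47),
    ],
)

# Update a record.
cur.execute(
    "UPDATE patients SET ward = ? WHERE name = ?", ("Neurology", "Patient B")
)

# Delete a record.
cur.execute("DELETE FROM patients WHERE name = ?", ("Patient C",))

# Produce a view restricted to one ward.
cur.execute(
    "CREATE VIEW cardiology AS "
    "SELECT name, age FROM patients WHERE ward = 'Cardiology'"
)

# Retrieve data from the table and from the view.
rows = cur.execute(
    "SELECT name, ward, age FROM patients ORDER BY id"
).fetchall()
cardiology_rows = cur.execute("SELECT name, age FROM cardiology").fetchall()

conn.commit()
conn.close()

print(rows)
print(cardiology_rows)
```

SQLite does not support stored procedures or user permissions, so those two operations are left out of this sketch; a server-based RDBMS would be needed to demonstrate them.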
NLP. Natural language processing (NLP) is an analytical tool used to process human language into a form that a machine can handle. The tool has applications in computational linguistics and intelligence analysis, making it critical for decision-making in health care (Croft & Lafferty, 2013). The data collected by individuals is normally in the form of natural language, which must be processed into a form that both the machine and the user can comprehend. This is the basis for the use of NLP in health analytics.
- This is among the most commonly used tools for the analysis of health data. The software is preferred, even over SAS, because it has become more robust and versatile over the years (Sagiroglu & Sinanc, 2013). The software has the potential to handle huge data sets.
MS Excel. The tool is one of the most basic, and hence most widely used, analytical tools in healthcare and other organizations. Unlike the other tools, it is also the easiest to access, as it comes with the Microsoft Office suite. MS Excel is Microsoft's electronic spreadsheet program (Sagiroglu & Sinanc, 2013). The software offers intuitive interfaces, outstanding charting tools, and excellent calculation capabilities.
Statistical Analytical Tools
Descriptive Statistics. Descriptive statistics are methods used to describe the data within a particular study. They provide simplified approaches to quantitative data analysis. An advantage of descriptive statistics is the ability to provide visual comparative tools, such as graphs and charts, based on the variables provided (Ott & Longnecker, 2015). Descriptive statistics are distinct from inferential statistics. They only describe the data that an analyst has captured. As a result, the analysis is limited to the information available to the analyst, unlike inferential statistics, which draw conclusions that go beyond the data at hand. Descriptive statistics are used to present quantitative data and to reduce large and voluminous data into simple, easily managed, and easy-to-understand summaries. Descriptive tools summarize the data in various forms and frequencies, providing measures such as means, medians, modes, standard deviations, and other forms of data presentation (Ott & Longnecker, 2015). Descriptive statistical analysis relies on software. The software mainly used to produce frequency analyses is MS Excel and SPSS. Both are used for decision-making and business intelligence based on large information systems. Excel and SPSS are Windows-based software packages that do not require knowledge of programming languages to operate.
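The measures named above can be illustrated with a short sketch in Python's standard statistics module, used here in place of Excel or SPSS purely for illustration. The length-of-stay values are invented sample data.

```python
import statistics

# Hypothetical lengths of stay, in days, for nine patients.
length_of_stay = [2, 3, 3, 4, 5, 5, 5, 7, 11]

# Summarize the sample with the usual descriptive measures.
summary = {
    "mean": statistics.mean(length_of_stay),
    "median": statistics.median(length_of_stay),
    "mode": statistics.mode(length_of_stay),
    # Sample standard deviation (divides by n - 1).
    "stdev": statistics.stdev(length_of_stay),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

These figures only describe the nine observations at hand; they make no claim about patients outside the sample.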
Probability Variation. The probability of an event or outcome is determined from empirical or theoretical models. The models are mathematically supported, based on logic, or both (Ott & Longnecker, 2015). The principle of idealized frequencies underlies the concept of probability. Variation arises when the outcome of an experiment cannot be predicted with absolute certainty, even when the same experiment is conducted on several occasions using the same model. The theoretical foundations of probability are based on mathematical models and expressed through a logical representation of the facts. As a result, mathematical formulations use theoretical approaches to develop comprehensible odds that explain the probability of an outcome. Variance describes the spread that can be expected in the values of a random variable and is used in several statistical methods (Ott & Longnecker, 2015). In addition, descriptive statistics, inferential statistics, and hypothesis tests, among other methods, use variance in their data presentations.
Hypothesis Testing. Hypothesis testing is a set of procedures that analysts use to develop a basis on which statistical hypotheses are accepted or rejected. Statisticians describe hypothesis tests as procedural statistical tests that weigh the evidence in the data provided in order to draw inferences about the variables under study (Ott & Longnecker, 2015). The procedure examines two mutually exclusive hypotheses: the null hypothesis is the statement to be tested, and the alternative hypothesis represents the desired or expected outcome. Essentially, the null hypothesis is generally a statement of no effect or no difference, since it is the baseline against which factual information, ideas, or positions are judged.
There is a general misconception that hypothesis tests are designed to influence the results of the two hypotheses. On the contrary, the null hypothesis takes precedence until there is sufficient data-based evidence to give precedence to the alternative hypothesis. Statisticians have developed a formal process for determining the validity of the null hypothesis (Ott & Longnecker, 2015). The process has four stages. The first stage is the statement of the null and alternative hypotheses, followed by the formulation of an analysis plan. The plan provides guidelines on how the sample data will be evaluated, based on a single test statistic. The actual data are then analyzed as proportions, mean scores, or other forms specified in the analysis plan. Finally, the results are interpreted according to a stated decision rule: if the observed value of the test statistic is unlikely under the null hypothesis, the null hypothesis is rejected.
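The four stages can be sketched with a two-sided one-sample z-test. This is a minimal illustration rather than a recommended procedure: it assumes the population standard deviation is known, and the waiting times, the hypothesized mean of 30 minutes, and the 0.05 significance level are all invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    n = len(sample)
    # Stage 3: compute the single test statistic from the sample data.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of a statistic at least this extreme if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Stage 4: decision rule, reject H0 when the result is unlikely under H0.
    return z, p_value, p_value < alpha


# Stage 1: H0: mean waiting time is 30 minutes; H1: it is not 30 minutes.
# Stage 2: analysis plan, a z-test at alpha = 0.05 with sigma assumed to be 5.
waits = [34, 36, 31, 38, 35, 33, 37, 36, 32, 38]  # hypothetical minutes

z, p_value, reject_null = one_sample_z_test(waits, mu0=30, sigma=5)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

When the population standard deviation is not known, a t-test would normally replace the z-test; that variant needs the t distribution, which Python's standard library does not provide.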
The Chi-Square Test. Chi-Square tests are statistical tests used to determine whether a significant relationship exists between two categorical variables. The comparison takes the frequencies within one nominal variable and compares them with the values of a second nominal variable. The data are analyzed through contingency tables, which place the data into rows and columns and so express the frequencies of one variable against the values of the second variable (Ott & Longnecker, 2015). When an association between the variables is identified, the test is said to yield a low significance value (p-value) for the chi-square test statistic, while a high significance value represents data whose relationship is not ascertained.
Both variables in a Chi-Square test are categorical, and the test assesses the relationship between them. The test indicates the extent to which the data provided support a difference between the variables. SPSS and Excel worksheets can be used to enter the data and to compute and analyze the Chi-Square statistic. Chi-Square designs are also used to test association models. These association models are non-parametric tests based on the variables provided for testing. Statisticians point to a limitation of some Chi-Square models in the computation of confidence intervals, which means that guidance on sample size is not readily available (Ott & Longnecker, 2015). One clear requirement for representing data through Chi-Square models is the availability of data on the two separate variables. The model provides opportunities for professional and scientific approaches to data analysis that aim at sound results.
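The contingency-table calculation can be sketched as follows. The function name and the 2 x 2 table of counts are hypothetical; the critical value of 3.841 is the standard chi-square cutoff for one degree of freedom at the 0.05 level.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = treatment group, columns = readmitted (yes, no).
observed = [
    [30, 10],
    [20, 40],
]

stat, dof = chi_square_statistic(observed)
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
significant = stat > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {stat:.3f}, df = {dof}, significant: {significant}")
```

A statistic above the critical value corresponds to a p-value below 0.05, which is the "low significance value" case in which an association between the two variables is supported.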
Correlations. Correlation is one of the most widely used analytical designs in statistical analysis. This is because it is user friendly and simple to operate, and its coefficients summarize the relationship that exists between variables (Ott & Longnecker, 2015). Correlation is a statistical technique used in the analysis of quantifiable data. It does not apply to purely categorical variables such as gender or favorite color. SAS is data analysis software that offers data manipulation interfaces and produces charts and graphs for decision making. The software falls within the business applications category (Ott & Longnecker, 2015) and is mainly used for decision-making and business intelligence based on large information systems. It is used for correlation analysis because of its capacity to handle large volumes of data. Correlations among several variables are commonly presented in a matrix, and the strength of a relationship is often expressed as a percentage. In this regard, a 100% correlation implies a relationship between the variables tested that leads the statistician or analyst to conclude that the variables are perfectly correlated.
Correlation indicates the extent of the relationship, or closeness, between a pair of variables. The technique can be used to analyze the relationship between height and weight, examining whether taller people are heavier than shorter ones. The data may also show that many shorter people are heavier than taller people, which makes it necessary to ascertain the relationship. However, it is important to note that the correlation between two variables does not necessarily give the same result when the relationship is measured in a different form. For example, it is a fact that both young children and older people require more healthcare attention, which suggests a correlation. However, the frequency and nature of the healthcare needs of the two groups are different; hence, there could be no relationship.
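The height and weight example can be sketched with the Pearson correlation coefficient, one common measure of linear association (the text does not name a specific coefficient, so this choice is an assumption). The measurements are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from the means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical heights (cm) and weights (kg) for five people.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 75, 88]

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")
```

A coefficient of 1 or -1 corresponds to the "100% correlation" case, and a value near 0 indicates little or no linear relationship. The function assumes both lists vary; a constant list would cause a division by zero.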
Multiple Regressions. Multiple regression is an extension of simple linear regression that is commonly used to measure the relationship between an endogenous (dependent) variable and two or more exogenous (independent) variables (Ott & Longnecker, 2015). The technique mainly performs three functions: causal analysis, forecasting the effects of changes, and predicting trends. First, multiple regression may be used to identify the strength of the effect that the independent variables have on the dependent variable. This use is best framed as a set of sample questions, since well-formed questions make the results more coherent. Typical questions ask about the strength of the relationship between measurement and impact, sales and advertising cost, or age and wage. Unlike correlation, multiple regression focuses on the nature and extent of the relationship between the dependent variable and the independent variables. In addition, multiple regression assumes the existence of a dependence relationship.
Second, the technique can be used to forecast the impacts or effects of changes. It helps the researcher to recognize how much the dependent variable changes when an independent variable changes. A model question is: how much additional Y do I get for one extra unit of X? Third, multiple regression predicts patterns and future trends, and can be used to obtain point estimates. A model question is: what will the cost of X be a given number of months from now? When choosing a model for multiple regression, one of the most important attributes is the model fit. Adding variables to the model always increases the share of variance explained, but it does not by itself guarantee quality or statistical validity, and too many variables can lead to an over-fitted model.
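A minimal sketch of fitting a multiple regression by ordinary least squares is given below, using only the Python standard library. The function names, the predictors, and the data are hypothetical; the data are synthetic and generated from y = 10 + 2*x1 + 0.5*x2 so that the fitted coefficients can be checked by eye. Statistical packages would normally be used instead of hand-written elimination.

```python
def fit_multiple_regression(X, y):
    """Least-squares fit; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


def predict(coefs, x):
    """Predicted value for one observation x = (x1, x2, ...)."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


def r_squared(coefs, X, y):
    """Share of the variance in y explained by the model (model fit)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - predict(coefs, x)) ** 2 for x, t in zip(X, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


# Hypothetical predictors: x1 = prior visits, x2 = age; y = cost index.
X = [(1, 20), (2, 35), (3, 30), (4, 50), (5, 45), (6, 60)]
y = [22.0, 31.5, 31.0, 43.0, 42.5, 52.0]

coefs = fit_multiple_regression(X, y)
fit = r_squared(coefs, X, y)
print([round(c, 3) for c in coefs], round(fit, 3))
```

Each slope answers the question "how much extra Y for one extra unit of X, holding the other variables fixed", and `predict` gives a point estimate for a new observation. Because R-squared never decreases when variables are added, a high value alone does not show that the model is valid.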
One-way and two-way Analysis of Variance. Research in complex fields such as business, economics, sociology, and related disciplines requires sophisticated analytical tools. One-way and two-way analysis of variance, also known as ANOVA, is well suited to this purpose. The technique compares group means across the levels of one or two factors, which allows more than two groups to be compared simultaneously. In one-way ANOVA the researcher analyzes a single factor, while in two-way ANOVA the researcher explores two factors at the same time (Ott & Longnecker, 2015).
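The one-way case can be sketched as the ratio of between-group to within-group variance. The function name and the three groups of scores are hypothetical; the two-way case, which adds a second factor, is not shown.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical recovery scores under three treatment plans (one factor).
plan_a = [4, 5, 6]
plan_b = [6, 7, 8]
plan_c = [8, 9, 10]

f_stat, df_b, df_w = one_way_anova_f([plan_a, plan_b, plan_c])
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F value indicates that the group means differ by more than the variation within groups would suggest; the value is then compared with an F distribution table for the given degrees of freedom.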
Data Visualization or Representation Techniques
Data visualization presents data in a graphical or pictorial format. As a representation technique, it enables decision makers to see analytics visually and so grasp difficult concepts as well as new patterns. Interactive data visualization integrates technology into the drawing of graphs and charts, changing the manner in which data is presented and processed. According to Khan and Khan (2011), data can be presented in different forms depending on the analytical tool used, either as charts or as graphs. Among the most common data visualization methods is the bar graph. The method is commonly used to present discrete data. The graph consists of horizontal or vertical bars whose length represents the data values obtained through analysis. The method is also useful for presenting a single series of data as well as associated data points in a series. The graph can also indicate trends, such as an increase in value over time. The figure below is a good example.
Besides a bar graph, data can be presented in the form of a pie chart. This visualization takes the form of a circle divided into sectors, each describing a percentage of the whole. The size of each wedge is determined by its data value in comparison to the others. Data points that share characteristics or features are grouped into the same wedge, and the wedges are labeled with the data points they represent. The figure below shows an example of a standard chart.
Use of Analytics in Healthcare
In Healthcare Quality Improvement
Through analytics, health care organizations implement an enterprise data foundation as a fundamental step in establishing quality improvement initiatives. Coiera (2015) noted that the ability to extract actionable and meaningful insights is foundational to improving patient satisfaction, outcomes, and quality initiatives. Hospitals have made significant strides in harnessing big data, thereby enhancing their operational and quality direction. The integration of information technology systems into the health care system is anchored in the desire to ensure patient safety. Several factors underpin the increased demand for big data applications in the health care sector. Growth in the healthcare sector influences health care providers to discourage any form of overutilization. Therefore, actors and stakeholders in the healthcare industry have to engage more in the process of amassing and transmitting information (Lorenzi & Riley, 2013). In fact, a critical factor in the proliferation of big data applications in the healthcare sector is the transformation perceived in clinical functions.
In Healthcare Performance Improvement
Hospitals continue to use analytics to pursue performance improvement initiatives and thus to work on the patient experience, improve clinical outcomes, and reduce operational costs. These efforts capitalize on core competencies and resource capabilities to sustain improvement (Strome, 2013). The primary goal of data analytics is to prioritize performance improvement efforts in a bid to achieve momentum across the spectrum of performance measurement. While healthcare practitioners have conventionally used their expert judgment to make medical decisions, particularly regarding treatment, strategic efforts must be made to steer health care processes towards evidence-based medicine (Lorenzi & Riley, 2013). Evidence-based medicine requires healthcare professionals to methodically assess clinical data and make medical decisions based on an assessment of the available information.
In Healthcare Clinical Decision Making
The healthcare system uses analytics, together with organizational leadership, to make clinical decisions that leverage the efficiencies of the system. Analytics addresses multiple management issues by allowing users to build on electronic data and employ it consistently in the delivery of quality healthcare. Analytics has the capacity and potential to completely revolutionize the health care sector (Lorenzi & Riley, 2013). It has greatly impacted the delivery of quality and safe healthcare to patients by ensuring that stakeholders in the health care industry are dedicated to recognizing and creating new treatments and perspectives in the delivery of quality health care. The insights and knowledge derived from big data in the healthcare sector also support processes such as research and development as well as advancements in medicine.
In Healthcare Administrative Decision Making
In recent years, the health care sector has experienced many innovations through administrative decision making. To support innovation in wearable technology, experts in the health care sector will inevitably require big data collected in real time. According to Coiera (2015), an increasing number of consumers in the health care sector are interested in acquiring smart medical devices and health wearables. The use of insights and knowledge from information systems and analytics has led to an increased focus on ensuring that the health care professionals who respond to the medical needs of patients are well qualified. It is very important for the medical personnel who are entrusted with the lives and health of patients to have impeccable performance and ethical records in order to ensure the best patient outcomes (Lorenzi & Riley, 2013). Analyzing big data makes it much easier to identify the skills, performance records, and abilities of health care professionals rather than relying on their career designations.
In Healthcare Fraud Detection
Analytics in healthcare plays a significant role in fraud detection. Predictive analytics detects credit card fraud, which saves the healthcare system considerable cost. Similarly, account auditing can be done carefully through analytics to reveal suspicious policyholders and providers. Analytics therefore allows for the practical approach needed to scrutinize audit performance (Coiera, 2015). It includes predictive patterns that create and capture value from big data in the healthcare sector and thus prevent fraud. Big data in the healthcare industry has also provided healthcare professionals with the insight and knowledge they require to inform patients on how to lead healthy lifestyles.
In Public Health
Predictive analytics supports epidemiological processes by predicting disease outbreaks, chronic diseases, and other information relevant to public health. Health care institutions are tasked with the responsibility of providing clients with safe and quality health care. By analyzing chronologically arranged historical information about individual patients, health care professionals are able to offer patients appropriate and timely treatment (Coiera, 2015). Insights gathered from big data enable health care professionals to abide by the set standards of health care and engage in coordinated efforts to ensure the delivery of safe and quality health care. Such collaborative efforts ensure that relevant healthcare personnel are able to access patient information and thus prevent duplication of effort.
In Pharmaceutical Industry and Discoveries
Big data has revolutionized pharmaceutical processes by driving data generation, supporting research and development, and identifying new potential drug candidates. In practice, this comes down to developing effective, approved, and reimbursed medicines more quickly. Lorenzi & Riley (2013) hypothesized that biological process modeling has shaped the sophistication of pharmaceutical products. Leveraging the diversity available in clinical and molecular data helps in identifying clinical trials for testing the efficacy of a pharmaceutical product.
In Human Genomic data analysis and Personalized Medicine
Recent advances in genomics show that the convergence of biological and information systems has made precise genomic modeling possible. New technologies have led to the generation of DNA sequencing and other genomic data. The need to manage, analyze, and interpret genomic data has opened a pathway in medicine and healthcare, leading to solutions that address problems related to health information management (Lorenzi & Riley, 2013). Medication and other strategic directions for treatment and diagnosis hinge on genomics to maintain the health of the people.
Role of Data Quality in Healthcare Analytics
Health care quality improvement initiatives are governed by the principles of data analysis. Increased spending on the healthcare system reflects the use of health information technology to create capabilities that can deliver evidence-based health care. Data analytics therefore shapes the process of improving quality through intelligent algorithms. As a result, most hospitals and healthcare practitioners reduce the instances of medical error. Analytics and interfaces allow health care organizations to deal with vast amounts of data and serve a diverse client base. As such, the integrated health care system improves quality initiatives by relying on appropriate system platforms to ensure efficiency (Lorenzi & Riley, 2013). Hospitals and other health care providers manage data quality within their specific operating systems as well as in stable versions of open source applications. To this effect, hospitals are capable of using the underlying databases to analyze and process the collected data.
The interface is one of the most crucial elements in the creation of a user-friendly product. It determines the degree to which the user will be able to engage with the finished product. Data analysis is likewise highly dependent on the interface through which it is presented. Therefore, health care systems use software applications that make the analysis, and especially the interpretation of the findings, an enjoyable undertaking. It is evident that the visualization and presentation of data analysis shape the pathway for quality improvement (Coiera, 2015). Accordingly, health care analytics usually makes use of friendly interfaces to ensure the satisfaction of clients.
Summary
Healthcare analytics and data mechanics are among the crucial undertakings of the healthcare sector. To sustain quality improvement initiatives, healthcare systems must build an integrated analytic system that expands the healthcare service, including collecting data through web analytics and then combining it with data from other operational systems. Healthcare analytics underpins quality improvement initiatives. This requires much refining in order to ensure the success of the project (Coiera, 2015). Because data collection and analysis are increasing, there is a need to expand the existing information technology infrastructure so as to satisfy the quality needs of the health care organization.
Healthcare systems have prospects in data analytics. These prospects are aligned with the desire to venture into corporate data mining that is geared towards aiding business processes in light of prevailing trends in the consumer markets. As such, health care systems are extensively involved in the data analytic process, with a particular bias towards information obtained from the web. The project therefore involves massive redesign of the software and the existing hardware features (Coiera, 2015). In essence, information analytics requires an integrated mechanism for incorporating new software applications into the computer systems that are already operational.
In that case, the hardware needs to be upgraded to handle adequately the vast data that hospitals are bound to deal with. In addition, data security comes into play, with the security of the computer systems and the data contained therein being of paramount importance. To minimize the costs of gathering the prerequisite resources, including the requirement for massive storage, and to address security concerns, cloud computing has been regarded as a viable solution (Lorenzi & Riley, 2013). As a result, cloud computing promises effective quality improvement initiatives, both as a sound economic venture and as a technological advancement of the prevailing IT infrastructure. Future trends and the direction of analytics call for an assessment of the corporations that offer cloud computing services, as well as of whether "going it alone" presents the best option.