Big Data Management in Health Care


At present, the healthcare system is facing many new challenges that require interdisciplinary solutions. Without automating certain processes and making data-driven decisions, the system may falter, depriving many people of medical help. As of now, the connectivity between health providers is insufficient, which leads to mishaps in coverage and prescription. In many countries, health insurance price formation is out of control, forcing people to pay out of pocket to afford care. Personalized patient care is often out of reach due to the high workload that an average health worker has to deal with.

Lastly, the healthcare system often resorts to taking reactive measures and treating the symptoms instead of predicting outbreaks and epidemics. This paper argues that big data management has its place in the healthcare system but not without certain limitations that have yet to be eliminated.

Background of the Industry

The healthcare system consists of many providers, facilities, and institutions that often fail to communicate effectively with each other. According to Tang, Plasek, and Bates (2018), at the dawn of the health data economy, data management can and should be rethought and reinvented. The researchers report significant progress in current medical big data trends. For instance, the European Commission projects that the value of the European Union data economy would jump to US $860 billion by 2020, from US $331 billion in 2015 (a growth of roughly 160%).

Yet, the current data resources are often siloed, which makes it impossible to advance the healthcare data economy in the current clinical conditions. Besides, artificial intelligence (AI) and machine learning (ML) models come with their own set of limitations. Thus, there is a need for a comprehensive SWOT analysis highlighting strengths, weaknesses, opportunities, and threats.

Strengths of Big Data in Health Care

Diagnosing and predictive modeling are among the most promising areas of big data, machine learning, and artificial intelligence applications. Some countries have already been using big data analytics to optimize their healthcare systems. On the frontier of the digital revolution in health care is China, which has been launching successful predictive modeling projects in the last few years. Even though the scope of the effect has so far been confined to the local level, the consistently positive findings hint at the possibility of a wider use. Walker (2018) writes that Health Connect, an organization with the structure of a large medical insurance provider, was able to track the key trends in chronic disease management. Before Health Connect intervened, the local Social Health Insurance (SHI) had experienced an annual increase of 20% in medical expenses. After being introduced to big data methods, the SHI saw a 1.2% decrease in the first few years.

As for diagnosing, big data can be used to make the process more precise and time-efficient. Probably one of the most well-developed software systems in this field is IBM Watson, which employs vast databases to make diagnoses in as little as 20 minutes. Apart from routine cases, IBM Watson was able to handle rare diseases that could not be identified by the medical staff. Rare diseases are especially life-threatening because they go unnoticed for extended periods of time and later result in complications. IBM Watson was able to solve some cases that as many as 40 interdisciplinary specialists had not been able to solve.


Weaknesses of Big Data in Health Care

Despite the promising prospects that big data, ML, and AI have when it comes to making predictions, they should be used with caution. As of now, the practical implications of big data management for diagnosing and predicting are not exactly clear. As Raghupathi and Raghupathi (2014) state, not a single model shows 100% precision. Theoretically, a precision of 80-90% may appear sufficient, but in practice, it will mean conditions going unnoticed and dismissed (Amarasingham, Patzer, Huesch, Nguyen, & Xie, 2014). Surely, health workers can check everything twice, but then it is unclear why technology is even needed in the first place if it does not facilitate the workflow.
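To make the cost of imperfect accuracy concrete, the following minimal Python sketch counts how many real cases a model would miss. It rests on stated assumptions: the 80-90% figures are treated here as sensitivity (the share of real cases that the model flags), which the cited sources may define differently, and the cohort of 1,000 patients and the function name `missed_cases` are purely illustrative.

```python
def missed_cases(true_cases: int, sensitivity: float) -> int:
    """Return how many real cases go undetected at the given sensitivity."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    detected = round(true_cases * sensitivity)
    return true_cases - detected


# Hypothetical cohort: 1,000 patients who truly have the condition.
for s in (0.80, 0.90):
    print(f"sensitivity {s:.0%}: {missed_cases(1000, s)} cases missed")
```

Under these assumptions, even the better-performing model leaves one in ten real cases undetected, which is why a second check by a health worker remains necessary.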

Another issue that may arise is health workers’ low computer literacy. Even now, when many medical facilities are buying more electronic equipment and medical software, the staff struggle with handling it and with finding extra time to learn its proper use (Amarasingham et al., 2014). In this case, implementation might require training sessions for the staff, if not in-house specialists able to fix the new technologies when necessary. This may mean additional costs for medical facilities, which, again, diminishes the positive effect of automated diagnosing and predictive modeling.


Opportunities of Big Data in Health Care

Patient Data Sharing for Learning and Collaboration

Medical data takes various forms: genetic, genomic, proteomic, clinical, imaging, public health, and patient-centered, just to name a few. Depending on its current needs, a facility can choose to access the necessary data for conducting independent research or making data-driven decisions. Moreover, data sharing benefits every party involved: the patient, the health worker, and the third-party provider. For the first two participants within the healthcare system, the benefits of sharing data are quite obvious. A shared system converts all data into a unified format and stores it carefully across the databases, which prevents the loss, discontinuity, and incompleteness of information (Holmgren, Patel, & Adler-Milstein, 2017).

Patients can be sure that their medical information is accessible by their key health providers so that all the particularities are considered when making a diagnosis or putting together a treatment plan. Health providers, in turn, can control what is being prescribed and how patients are treated over extended periods of time due to the shared system of medical health records.

A prime example of how streamlined patient data sharing can not only improve the functioning of the system but literally save lives is opioid prescription. In some countries such as the United States, one can speak of a full-fledged opioid addiction epidemic. Even though opioid medications are typically seen as highly effective, they have many side effects that patients often choose to dismiss in search of relief from pain. Lewis, Kohtz, Emmerling, Fisher, and Mcgarvey (2018) report that in the US, opioid overdose and opioid use complications take the lives of as many as 15,000 people every year. What is even more concerning is that, as the Centers for Disease Control state, the number of opioid prescriptions and overdose deaths has quadrupled over the span of the last twenty years.

What often happens is that a patient abuses opioid medication because smaller doses no longer provide proper relief from pain. To receive more medication, those patients contact more than one provider to see if they can be prescribed more. This is a considerable issue arising from the lack of connectivity and communication between providers. However, the proper use and management of big data sharing can help to tackle this issue. Providers will be able to cross-check different databases and see if a particular patient is already on pills that have been found to be toxic and life-endangering when taken beyond the norm.
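The cross-check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of any existing monitoring system: the `Prescription` record, the field names, and the function `flag_overlapping_opioids` are hypothetical, and the default `dose_limit_mg` is a placeholder threshold, not clinical guidance.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    patient_id: str
    provider_id: str
    drug_class: str
    daily_dose_mg: float  # hypothetical normalized daily dose


def flag_overlapping_opioids(records, dose_limit_mg=90.0):
    """Flag patients who receive opioids from several providers
    and whose combined daily dose exceeds the placeholder limit."""
    totals = defaultdict(float)
    providers = defaultdict(set)
    for rx in records:
        if rx.drug_class == "opioid":
            totals[rx.patient_id] += rx.daily_dose_mg
            providers[rx.patient_id].add(rx.provider_id)
    return {
        pid: {"total_mg": totals[pid], "providers": sorted(providers[pid])}
        for pid in totals
        if len(providers[pid]) > 1 and totals[pid] > dose_limit_mg
    }


# Records pooled from two hypothetical provider databases.
clinic_a = [Prescription("P-001", "clinic-A", "opioid", 60.0),
            Prescription("P-002", "clinic-A", "opioid", 30.0)]
clinic_b = [Prescription("P-001", "clinic-B", "opioid", 50.0),
            Prescription("P-002", "clinic-B", "statin", 20.0)]

flagged = flag_overlapping_opioids(clinic_a + clinic_b)
print(flagged)
```

The check only becomes possible once both providers' records sit in one pooled list, which is exactly the connectivity that siloed databases lack.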

Another possible use of streamlined patient data sharing is for research. It is true that journal articles are publicly accessible on a plethora of portals such as PubMed and NCBI. Their findings can and do serve a good purpose: they provide a solid theoretical foundation for evidence-based interventions and medical process changes. However, these articles rarely if ever share data products, i.e. models, intermediate results, training corpora, scans, images, and other types of medical data. Sheller, Reina, Edwards, Martin, and Bakas (2018) show that in order to train a deep neural network with high precision, researchers might need a set of as many as 10,000 labeled images. It is obvious that for the sake of time and cost efficiency, they would have to rely on shared databases.

Health Insurance Price Transparency

Another use of big data in healthcare aims at mitigating the costs of medical services, especially for the patients. Reinhardt (2014) explains that in economically developed countries, universal health care is seen as standard. Patients only end up paying quite modest parts of the medical bill at the time health care is used. Besides, health workers and medical entities such as clinics and hospitals are paid on uniform common fee schedules.

Other countries, however, are far from this standard of care: among them are both developed countries such as the United States and developing ones such as India and China. When universal health care is underdeveloped or non-existent, the system might lack patient orientation, consistency, transparency, and accountability. For example, in the US, private health insurers are free to negotiate prices with every medical entity or health worker (Hilsenrath, Eakin, & Fischer, 2015). As a result, it is not uncommon for patients to pay out of pocket because their insurance plans do not cover their entire medical bills.

Sinaiko and Rosenthal (2016) argue that patients of the new generation are already more engaged in the decision-making process regarding their medical expenses. The researchers show that the proponents of transparency in the medical systems hope for more accessible health care price information. Hilsenrath et al. (2015) show that big data management and machine learning can modernise health insurance price formation. Bresnick (2018) describes an application in the making that will bring more accessible price data to patients. Previously, only the payer had all the information available and would only let patients see parts of it. With big data analytics, health care might approach the new frontier where patients will gain more agency and control over their finances.

Patient Assistance

Another great challenge that healthcare is facing today is the shortage of health workers at a time when the number of patients in need of medical help and the burden of disease are growing significantly. As a result, the existing medical staff is barely able to handle the many requests and appointments, which makes the wait times unacceptably long. It is evident that handling the vast amount of financial data to give customised advice would be impossible manually; hence, it makes sense to do so through innovative digital technologies. At present, new technical developments are able to mimic personal patient assistance, processing requests in text and voice forms, forming a response, and disseminating useful information.

A chatbot is a prime example of a technology that makes it possible to provide automated personal assistance to patients. It is a piece of conversational software that uses natural language processing (NLP) to interact with users in a human-like manner. The end goal is to make a user feel at ease conversing with a digital representative of a company or service (Ni, Lu, Liu & Liu, 2017). Technologies that employ NLP are now increasingly used in the healthcare sector as well.

Recent statistics show that natural language processing for industry use has been a rising trend across the globe. According to Pennic (2019), in the healthcare market alone, chatbots are expected to reach $498 million, growing at a compound annual growth rate of 26.29% over the next ten years. Chatbots in medicine have gained greater acceptance due to the popularity of smart devices and Internet availability (Hirschberg & Manning, 2015). Apart from that, chatbots show promising prospects when it comes to cost optimization and customer experience enhancement.
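As a rough illustration of what such a compound growth rate implies, the Python sketch below works backwards from the quoted end value. It assumes that the 26.29% rate applies uniformly over the full ten years, which the forecast itself may not state in those terms; the function names `implied_start_value` and `project` are hypothetical.

```python
def implied_start_value(end_value: float, annual_rate: float, years: int) -> float:
    """Starting value that compounds to end_value at a constant annual rate."""
    return end_value / (1.0 + annual_rate) ** years


def project(start_value: float, annual_rate: float, years: int) -> float:
    """Value reached after compounding start_value for the given years."""
    return start_value * (1.0 + annual_rate) ** years


# Figures quoted in the text: $498 million, 26.29% per year, ten years.
start = implied_start_value(498.0, 0.2629, 10)
print(f"Implied starting market size: ${start:.1f} million")
```

Under this assumption, the market would grow roughly tenfold over the decade, from a base of under $50 million.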

Firstly, some conditions need ongoing monitoring, and in the case of an emergency, a matter of minutes can decide whether a patient survives. A chatbot can substitute for a health worker outside the working hours of the latter. Apart from that, it can be a mediator that takes action, i.e. forwards the matter to an emergency worker, if a situation deteriorates (Velupillai, Mowery, South, Kvist & Dalianis, 2015). Furthermore, the growing population and the increased burden of disease both require logging vast amounts of data (Fabian, Ermakova & Junghanns, 2015). A health worker might struggle to navigate databases on their own in an attempt to locate a particular profile. As Tirinato et al. (2015) point out, medical database management could benefit from an AI-based mechanism such as a chatbot that would be responsible for internal record search and keeping.
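The mediator role can be pictured with a deliberately simple sketch. It uses plain keyword matching rather than real natural language processing, so it only illustrates the routing logic; the term list, the route labels, and the function `route_message` are hypothetical and would need clinical review before any practical use.

```python
# Hypothetical phrases that should trigger immediate escalation.
EMERGENCY_TERMS = ("chest pain", "cannot breathe", "unconscious", "severe bleeding")


def route_message(message: str) -> str:
    """Decide where a patient message should go, checking emergencies first."""
    text = message.lower()
    if any(term in text for term in EMERGENCY_TERMS):
        return "escalate_to_emergency_worker"
    if "appointment" in text:
        return "scheduling"
    return "general_information"


for msg in ("I have chest pain and feel dizzy",
            "Can I move my appointment to Friday?",
            "What are your opening hours?"):
    print(f"{msg!r} -> {route_message(msg)}")
```

The emergency check comes first on purpose: a message that mentions both an appointment and an alarming symptom should reach a human without delay.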

Lastly, chatbots could take over the responsibility of handling appointments, which, given the growing heart patient population, would lighten the workload. An automated system that knows precisely how often a patient needs to see the doctor and the working hours of the latter would benefit all parties involved. According to Pennic (2019), AI-enabled tools may help tackle the issue of missed appointments and save as much as $200 per missed appointment.


Threats of Big Data in Health Care

One of the main threats is impeded streamlining of medical data. This issue is unfortunate as it prevents health providers from seizing many of the opportunities that data sharing presents. For all its advantages, streamlined patient data sharing has its downsides and shortcomings. Probably the greatest concern is the anonymity and privacy of the data. Using patient information for unethical purposes in a way that excludes the patient from making decisions violates the main medical ethics principles. El Emam, Rodgers, and Malin (2015) argue that there must be ways to make data sharing as ethical as possible and one of them is complete anonymization before the use and transfer of data. Anonymized health data is no longer considered private under the jurisdiction of the European Union and the United States.
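A first step toward such protection can be sketched in Python as follows. It should be read as a de-identification illustration only: replacing an identifier with a keyed hash is pseudonymization, not the complete anonymization discussed above, and the remaining fields could still allow re-identification. The field names, the `DIRECT_IDENTIFIERS` set, and the function `deidentify` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical fields that identify a person directly and are dropped outright.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers, pseudonymize the ID, and coarsen the age."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keyed hash: the same patient maps to the same pseudonym,
    # but the original ID cannot be looked up without the key.
    digest = hmac.new(secret_key, str(record["patient_id"]).encode(), hashlib.sha256)
    cleaned["patient_id"] = digest.hexdigest()[:16]
    # Generalize the exact age into a ten-year band.
    age = cleaned.pop("age")
    low = (age // 10) * 10
    cleaned["age_band"] = f"{low}-{low + 9}"
    return cleaned


record = {"patient_id": "P-001", "name": "Jane Doe", "phone": "555-0100",
          "address": "1 Main St", "age": 47, "diagnosis": "type 2 diabetes"}
shared = deidentify(record, secret_key=b"demo-key-not-for-production")
print(shared)
```

Because the pseudonym is stable for a given key, records of the same patient can still be linked across datasets for research while the name, phone number, and address never leave the facility.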

Another threat relates to the opportunities that big data presents in the area of medical insurance transparency. The big change might take a lot of time: one may readily imagine the issues that might arise during the process of data collection. It would only be natural for third-party providers to be reluctant to permit data scraping since it would often mean losing their competitive advantage. It becomes obvious that the main threats regarding big data emerge because of the unwillingness of some parties involved to collaborate and adopt big data models.

On the other hand, patients may not be open to adopting new technologies such as chatbots, and the evidence regarding their attitude is quite conflicting. For instance, Bleustein et al. (2014) state that wait times, availability, and customization of treatment plans were key elements of patient satisfaction. Patients may feel less neglected once they are aware that all their data is carefully logged by an algorithm and that predictions and health decisions are made based on detected patterns.

However, chatbots may understandably frustrate elderly patients who might not be technology-savvy (Bleustein et al., 2014). Besides, like any other technology, chatbots are unlikely to be devoid of bugs and mishaps. This might lead to undesirable situations when a patient is trying to receive information, but the application lags or does not process requests properly.


Conclusion

In the digital age, it is essential that medical facilities and institutions are able to make the best use of patient data to improve work processes and, subsequently, care quality and access. Current demographic tendencies, namely, the growing and aging patient populations, put an extra strain on medical workers, especially given the shortage of the latter. Streamlined patient data sharing can help health providers to cross-reference and check patient information.

Aside from that, accessible health data can be put to good use in research. Big data has the potential to tackle the financial aspect of patient care and make the system more transparent. Patients may also benefit from automated patient assistance that can be realized through chatbots. Lastly, big data management allows for making predictions and diagnoses even when it comes to quite challenging cases.

Reference List

Amarasingham, R, Patzer, RE, Huesch, M, Nguyen, NQ, & Xie, B 2014, ‘Implementing electronic health care predictive analytics: considerations and challenges’, Health Affairs, vol. 33, no. 7, pp. 1148-1154.

Bleustein, C, Rothschild, DB, Valen, A, Valatis, E, Schweitzer, L & Jones, R 2014, ‘Wait times, patient satisfaction scores, and the perception of care’, The American Journal of Managed Care, vol. 20, no. 5, pp. 393-400.

Bresnick, J 2018, Can big data solve the health insurance transparency problem?. Web.

El Emam, K, Rodgers, S, & Malin, B 2015, ‘Anonymising and sharing individual patient data’, BMJ, vol. 350, p. h1139.

Fabian, B, Ermakova, T & Junghanns, P 2015, ‘Collaborative and secure sharing of healthcare data in multi-clouds’, Information Systems, vol. 48, pp. 132-150.

Hilsenrath, P, Eakin, C, & Fischer, K 2015, ‘Price-transparency and cost accounting: challenges for health care organizations in the consumer-driven era’, INQUIRY: The Journal of Health Care Organization, Provision, and Financing, vol. 52, p. 0046958015574981.

Hirschberg, J & Manning, CD 2015, ‘Advances in natural language processing’, Science, vol. 349, no. 6245, pp. 261-266.

Holmgren, AJ, Patel, V & Adler-Milstein, J 2017, ‘Progress in interoperability: measuring US hospitals’ engagement in sharing patient data’, Health Affairs, vol. 36, no. 10, pp. 1820-1827.

Kourou, K, Exarchos, TP, Exarchos, KP, Karamouzis, MV & Fotiadis, DI 2015, ‘Machine learning applications in cancer prognosis and prediction’, Computational and Structural Biotechnology Journal, vol. 13, pp. 8-17.

Lewis, MJM, Kohtz, C, Emmerling, S, Fisher, M & Mcgarvey, J 2018, ‘Pain control and nonpharmacologic interventions’, Nursing 2018, vol. 48, no. 9, pp. 65-68.

Ni, L, Lu, C, Liu, N & Liu, J 2017, ‘Towards a smart primary care chatbot application’, in International Symposium on Knowledge and Systems Sciences, Singapore, pp. 38-52.

Pennic, F 2019, Global chatbots in healthcare market to reach $498m by 2029. Web.

Raghupathi, W & Raghupathi, V 2014, ‘Big data analytics in healthcare: promise and potential’, Health Information Science and Systems, vol. 2, no. 1, p. 3.

Reinhardt, UE 2014, ‘Health care price transparency and economic theory’, JAMA, vol. 312, no. 16, pp. 1642-1643.

Sheller, MJ, Reina, GA, Edwards, B, Martin, J & Bakas, S 2018, ‘Multi-institutional deep learning modeling without sharing patient data: a feasibility study on brain tumor segmentation’, in International MICCAI Brainlesion Workshop, Springer, Cham, pp. 92-104.

Sinaiko, AD & Rosenthal, MB 2016, ‘Examining a health care price transparency tool: who uses it, and how they shop for care’, Health Affairs, vol. 35, no. 4, pp. 662-670.

Tang, C, Plasek, JM & Bates, DW 2018, ‘Rethinking data sharing at the dawn of a health data economy: a viewpoint’, Journal of Medical Internet Research, vol. 20, no. 11, p. e11519.

Tirinato, JA et al. 2015, Medical data acquisition and patient management system and method, U.S. Patent 9,015,055.

Velupillai, S, Mowery, D, South, BR, Kvist, M & Dalianis, H 2015, ‘Recent advances in clinical natural language processing in support of semantic analysis’, Yearbook of Medical Informatics, vol. 24, no. 1, pp. 183-193.

Walker, T 2018, How predictive modeling, big data analytics curb health costs in China. Web.