Introduction to Business Intelligence Oz Assignments

Introduction

Health research and the associated privacy policy are of great importance to society as a whole. Health research plays an important role in improving health care services. In any organization, the information it holds is an important asset. Some of this information is extremely confidential, and its confidentiality must therefore be protected. Information security refers to the protection of important information and of the associated hardware used to protect it (Gowtham et al. 2017). The term security refers to a composite system covering both internal and external operations, such that the collected data and the information technology remain protected and confidential. Information security serves several functions: protecting the data collected by the organization, protecting the organization’s capacity to carry out its assigned work, and securing the technology assets the organization uses. Information security matters because misuse of information resulting from weak protection can disrupt the lives of many people associated with the organization.

The Australian Digital Health Agency is responsible for handling the health-related information of all Australians. Over the past few decades, there has been a growing tendency to gather various types of health care information (Digitalhealth.gov.au. 2018). The growing number of people receiving health care services, combined with the small number of service provider groups, makes it extremely important to document detailed information about health conditions. Technological innovation in the machinery and equipment used for diagnosis and therapy has increased the amount of information available about each individual (Zhang et al. 2017). Among the various types of available data, the most readily available is prenatal testing data. Continuity of health care also depends on proper documentation of the associated risk factors, and such documentation can serve as protection against allegations of malpractice. Primary health care offers various services in addition to direct care. Health care providers record their observations, impressions and instructions. Third parties are often involved in the system as well, using the information to pay for different health care services.

Significance of the system of electronic record

Today, the health care sector places great importance on protecting the privacy of collected information. Electronically stored health records should be kept confidential, because accessing large volumes of health care information has become very easy, and it is therefore very easy to misuse confidential information. Electronic media are used to store health care data because it is not possible to keep a photocopy of every document containing a medical report; photocopying is inconvenient, as it is both laborious and time consuming. A second aspect of health care data is that electronically stored data are now considered a valuable asset (Zhang et al. 2015). Competing companies have been observed to collect information from physicians and pharmacies about other companies’ sales in order to increase their competitiveness and earn higher incentives.

Data gathered from various sources are first combined and then linked to other people’s profiles. When data are saved electronically, it becomes easier to extract them from a remote location and to explore the database within the established network. This, however, poses a high risk of third-party access to the information. In the absence of a proper security system to protect the organization’s database, anyone can easily access the data without leaving any trace (Legislation.gov.au 2018). Service providers can readily identify trends in data series that describe the health condition of the whole population. The electronic record system shows which kinds of medical treatment most people use, and providers can then use this information to design health care services around the most frequently used forms of medical care (Legislation.gov.au. 2018). This information is also useful for undertaking preventive measures. The most recent technological advancement in the medical care industry is the integration of electronic health care records with mobile technology (Hasanain, Vallmuur and Clark 2015). The electronic health record (EHR) also plays an important role in the industry. An important contribution of the system is its continuous evolution alongside improvement in the overall quality of health care service. The electronic system has an advantage over the paper-based system in saving the time and effort of service providers, which in turn has contributed to a growing number of service providers in recent years.

Because information on the health status and financial management of individual patients is stored electronically, errors in processing the information are reduced. This also helps to avoid the risk of malpractice in meeting reimbursement claims. A major drawback of the current legal framework for confidentiality is that obligations relating to health care records do not vary according to whether a record is stored electronically or on paper. The degree of confidentiality does, however, vary depending on the information holder and the nature of the information kept.

Evolution in electronically stored health records

With the increasing burden of health records, electronic record systems have undergone rapid technological change to offer targeted support to service providers. Large variation within a nation creates room for substantial support from the use of electronic health records. The market for electronic health records has thus undergone major changes in recent years, depending on the propositions of the authorities concerned. In most hospitals, the electronic health record system plays an important role (Banfield et al., 2017). A change in the structure of health records is therefore associated with a change in patient outcomes as well. According to the findings of the National Center for Health Statistics, approximately 75 percent of service providers are able to enhance the quality of patient care by using electronically stored data. The function of an electronic health record system is not limited to storing necessary data and information; it also provides information that can be used for the benefit of patients. The system records when patients take new medication and alerts physicians to potential issues. Digital health technology has undergone significant changes in recent years, influencing the lives of many people.

The Australian Digital Health Agency gathers and protects the personal information of citizens. The organization upholds the privacy principles in accordance with the Privacy Act 1988. The operators of the Health Record System manage the personal information held by the organization. Personal information is treated as information or an opinion by which an individual is identified, for purposes such as communication and the arrangement of business meetings and obligations (Myhealthrecord.gov.au. 2018). The operators maintain the ‘My Health Record System’. Information is gathered about recipients of health care services, and various other records are reported by operators of registered repositories (Hsiao, Hing and Ashman, 2014). State or territory legislation prevents the data from being shared with third parties. The regulation imposes civil and criminal penalties on any person who accesses, uses or discloses information without valid authorization.

Information in the organization is gathered by telephone, mail and facsimile, from members of the public and from other health care operators. Data are collected on job titles, employee records, individual images, bank details and work history. The information registered with the healthcare organization helps to identify healthcare services for the individual. First aid received by an individual is administered on the Agency premises. The agency then collects the relevant information and discloses personal information in order to manage its relations with employees and fulfil the duties of the organization. Personal information is often needed in situations such as conducting dealings with business associates and other contractors, managing the workforce, meeting legislative obligations, delivering functions, marketing information about goods, services, initiatives or events, and providing communication information. After receiving marketing materials from the organization, an individual may pursue further communication about them. Disclosure of information to the Department of Human Services provides further support to the health care organization (Myhealthrecord.gov.au 2018). When an individual applies to register for the digital health care service through the Document Verification Service, the agency is in a position to disclose the applicant’s personal information. It is not possible to guarantee the security of data gathered over the internet. Relevant information is also collected from the IP address of the user’s computer. Cookies and cache files are stored by the agency’s website, ‘www.digitalhealth.gov.au’. The agency uses the geo-location corresponding to the IP address to determine the user’s location. Information can also be gathered through the download or sign-up drop box options.

Members of the public have the right to request access to, or correction of, the information currently held by the Agency. Before granting access, the agency confirms the identity of the individual. The agency always attempts to respond quickly to any complaints or enquiries from an individual participant (Pearce and Bainbridge 2014). A person with parental responsibility for a healthcare recipient may act as an authorized representative only while the recipient is under the age of 18. The action taken on behalf of a healthcare recipient depends on the purpose of the act, and its effect on the service recipient depends on any modification of the prescribed regulation.

Task 2

Table 2.1 shows the results of the exploratory data analysis on the variables in the dataset “loan.delinq.csv”. This research aims to predict whether or not a person is delinquent on a loan. To evaluate this, several items of information on the status of the loans taken by people are recorded. Not all of this information needs to be included in the prediction model: some variables must have a significant impact on delinquency, while others might not affect it. The exploratory data analysis has been conducted to identify the variables most closely related to the target variable. All the analysis in this section has been conducted in the software RapidMiner.

In the dataset, the variable “SeriousDlqn2yrs” is the target variable, that is, the variable that will be predicted using statistical models. From the analysis, 139,974 persons are not loan delinquent, denoted by the value “0”, and 10,026 persons are loan delinquent, denoted by the value “1”. This is therefore a dichotomous variable taking only the two values “0” and “1”.

The second variable is “RevolvingUtilizationOfUnsecuredLines”, which is a continuous variable. It indicates the total balance on credit cards and personal lines of credit, excluding real estate and any debts paid in installments, such as car loans. The average balance on credit cards has been obtained as $5.75, with balances ranging from $0 to $50,708. This indicates that most people do not have much credit card balance, but some have extremely high balances due. The next variable is the age of the borrowers. Age is also a continuous variable; the average age has been obtained as 52.29 years, with a range from 0 to 109 years.

The variable considered next is “NumberOfTime30-59DaysPastDueNotWorse”. It indicates the number of times in the last 2 years that a borrower has been 30 to 59 days past the due date, but not worse. Its values range from 0 to 98 times, with an average of 0.421. “DebtRatio” is the next variable. It indicates monthly debt payments, alimony and living costs divided by the borrower’s gross monthly income. The debt ratio ranges from 0 to 329664, with an average of 353.005. Most borrowers have been observed to have a monthly income of around $5,000.

The variable “NumberOfOpenCreditLinesAndLoans” indicates the number of open loans, such as car loans and real estate loans paid in installments, or credit card loans. The average number of open loans has been found to be 8.45, with a range from 0 to 58. The variable “NumberOfTimes90DaysLate” indicates the number of times a borrower has been 90 days or more past the due date. It varies between 0 and 98 times, with an average of 0.27. The number of mortgage and real estate loans, including home equity lines of credit, is denoted by the variable “NumberRealEstateLoansOrLines” and has been found to vary between 0 and 54, with an average of 0.11. The variable considered next is “NumberOfTime60-89DaysPastDueNotWorse”. It indicates the number of times in the last 2 years that a borrower has been 60 to 89 days past the due date, but not worse. Its values range from 0 to 98 times, with an average of 0.24. The analysis also shows that most of the borrowers live alone and have no dependents in their families.
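The means and ranges reported above come from RapidMiner, but the same kind of summary can be reproduced elsewhere. The following is a minimal sketch in Python, not the procedure used in this analysis. Only the column names come from the dataset; the rows in `sample_rows` are made-up values for illustration, and `summarize` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical rows for illustration only; these are NOT taken from loan.delinq.csv.
sample_rows = [
    {"age": 45, "DebtRatio": 0.80, "NumberOfTimes90DaysLate": 0},
    {"age": 40, "DebtRatio": 0.12, "NumberOfTimes90DaysLate": 0},
    {"age": 38, "DebtRatio": 0.09, "NumberOfTimes90DaysLate": 1},
    {"age": 61, "DebtRatio": 0.35, "NumberOfTimes90DaysLate": 0},
]


def summarize(rows, column):
    """Return the mean, minimum and maximum of one numeric column."""
    values = [row[column] for row in rows]
    return {"mean": mean(values), "min": min(values), "max": max(values)}


# One summary per column, mirroring the mean and range quoted for each variable.
summary = {col: summarize(sample_rows, col) for col in sample_rows[0]}
for col, stats in summary.items():
    print(f"{col}: mean={stats['mean']:.2f}, range={stats['min']} to {stats['max']}")
```

For the real file, the rows could be read with `csv.DictReader` and converted to numbers before calling `summarize`; missing values would need to be handled first.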

The graphs showing the nature of the variables along with the exploratory data analysis results are given in the following figures and tables.

Table 2.1: Results of Exploratory Data Analysis for loan.delinq.csv

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10

Figure 11

From the analysis conducted and discussed above alone, it is not possible to identify which variables are most appropriate for predicting loan delinquent borrowers. Correlation analysis has therefore been performed. The results of the correlation analysis in figure 12 show clearly that the variables most closely related to the target variable “SeriousDlqn2yrs”, whether positively or negatively, are age, NumberOfTime30-59DaysPastDueNotWorse, NumberOfOpenCreditLinesAndLoans, NumberOfTimes90DaysLate and NumberOfTime60-89DaysPastDueNotWorse.

Figure 12: Correlation Matrix
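To illustrate how variables can be ranked by their correlation with the delinquency indicator, the sketch below computes Pearson correlation coefficients in plain Python. It assumes Pearson correlation is the measure behind the matrix, which the text does not state. The data values are hypothetical and `pearson` is an illustrative helper; only the variable names come from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (not from loan.delinq.csv).
target = [0, 0, 1, 0, 1, 1]               # SeriousDlqn2yrs
times_90_days_late = [0, 0, 2, 1, 3, 1]   # NumberOfTimes90DaysLate
age = [60, 55, 30, 48, 27, 35]

candidates = {"NumberOfTimes90DaysLate": times_90_days_late, "age": age}
correlations = {name: pearson(values, target) for name, values in candidates.items()}

# Rank by absolute value, since both positive and negative relationships matter.
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
for name in ranked:
    print(f"{name}: {correlations[name]:+.3f}")
```

Ranking by absolute value keeps strongly negative relationships, such as the one with age in this toy data, alongside the positive ones.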

Task 2.2

The first model considered for predicting loan delinquent borrowers is the decision tree model. Figures 13, 14 and 15 show the decision tree process, the decision tree rules and the decision tree respectively. The tree checks conditions from top to bottom in order to predict whether or not a person is loan delinquent. First, it checks whether the person has been more than 90 days past the due date more than 2.5 times. If the count is less than 2.5 times, it then checks whether the count is less than or greater than 1.5 times. If it is less than 1.5 times, the tree checks the number of times the person was 60 to 89 days past the due date; if this number is also less than 1.5 times, the person is not considered loan delinquent. If, on the other hand, the person has been more than 90 days past the due date more than 2.5 times, the person’s age is considered. If the age is greater than 22.5 years, the person is considered loan delinquent, but if the age is less than 22.5 years, the person is not considered loan delinquent.

Figure 13: Decision Tree Process

Figure 14: Decision Tree Rules

Figure 15: Decision Tree Model
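The branches described in the text can be written as nested conditions. The sketch below is a hand-written illustration of those branches only, not the model exported from RapidMiner. The function name `predict_delinquent` and its parameter names are hypothetical, and the age threshold of 22.5 years is assumed to apply to both sides of the age split. Branches that the text does not spell out return `None`.

```python
from typing import Optional


def predict_delinquent(times_90_days_late: float,
                       times_60_89_days_late: float,
                       age: float) -> Optional[bool]:
    """Return True (delinquent), False (not delinquent), or None when the
    branch is not described in the text."""
    if times_90_days_late > 2.5:
        # Frequently 90 or more days late: the tree then splits on age.
        return age > 22.5
    if times_90_days_late <= 1.5:
        if times_60_89_days_late <= 1.5:
            return False
    # Remaining branches appear only in the figures, so they are left open here.
    return None


print(predict_delinquent(3, 0, 40))   # more than 2.5 times late and older than 22.5
print(predict_delinquent(0, 0, 40))   # rarely late on both counts
```

Because the counts are whole numbers, thresholds such as 1.5 and 2.5 simply separate "at most 1" from "2 or more" and "at most 2" from "3 or more".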

Task 2.3

The second model considered for predicting loan delinquency is the logistic regression model. A logistic regression model is typically used to evaluate the probability that a person is loan delinquent or not. The results of the logistic regression model are provided in figure 18, and the process is shown in figures 16 and 17. The odds ratio indicates how the odds in favor of the prediction change for a one-unit increase in a variable; a ratio of 1 leaves the odds unchanged. The value of 0 reported for age is taken to indicate that age has no impact on the prediction, which is the correct reading for a coefficient, since a coefficient of 0 corresponds to an odds ratio of 1. An odds ratio of 0.2196 for the number of times a person is 30 to 59 days past due in the last 2 years means that each additional occurrence multiplies the odds of being loan delinquent by 0.2196. Likewise, an odds ratio of 0.1809 for the number of times a person is 60 to 89 days past due in the last 2 years multiplies the odds of being loan delinquent by 0.1809. Similarly, an odds ratio of 3.3515 for the number of open credit lines and loans multiplies the odds by 3.3515, and an odds ratio of 1.392 for the number of real estate loans multiplies the odds by 1.392. Ratios above 1 raise the odds of delinquency and ratios below 1 lower them.

Figure 16: Logistic Regression Model Process

Figure 17: Logistic Regression Model Process (continued)

Figure 18: Logistic Regression Coefficients and Odds Ratio
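The relationship between a logistic regression coefficient, its odds ratio and the resulting probability can be checked with a few lines of Python. This is a sketch under stated assumptions: the baseline probability of 0.10 is hypothetical, the helper names are illustrative, and the values 3.3515 and 0.2196 are used as odds ratios exactly as reported in the text.

```python
from math import exp


def odds_ratio(coefficient: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return exp(coefficient)


def apply_odds_ratio(probability: float, ratio: float, units: int = 1) -> float:
    """Probability after `units` one-unit increases, given the starting probability."""
    odds = probability / (1.0 - probability)
    new_odds = odds * ratio ** units
    return new_odds / (1.0 + new_odds)


baseline = 0.10  # hypothetical starting probability, chosen only for illustration

no_effect = apply_odds_ratio(baseline, odds_ratio(0.0))  # coefficient 0 gives ratio 1
higher = apply_odds_ratio(baseline, 3.3515)              # ratio reported for open credit lines
lower = apply_odds_ratio(baseline, 0.2196)               # ratio reported for 30-59 days past due

print(f"no effect: {no_effect:.4f}, ratio 3.3515: {higher:.4f}, ratio 0.2196: {lower:.4f}")
```

The odds ratio acts on the odds, not directly on the probability, so a ratio of 3.3515 does not make the probability 3.3515 times larger.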

Task 2.4

Figure 19: AUC Curve of Decision Tree Model
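As a rough guide to reading an AUC value, the sketch below computes the area under the ROC curve by comparing every positive case with every negative case. It is an illustration only, not the RapidMiner computation behind figure 19. The labels and scores are hypothetical, `auc` is an illustrative helper, and the positive class is assumed to be labelled 1.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores, for illustration only.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
example_auc = auc(labels, scores)
print(f"AUC = {example_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, so a rank-based method would be preferable on a dataset of this size.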