Data Analytics #06
Assessing Feasibility of Analytical Solution
Hello!!
We hope you enjoyed our previous newsletter, “Data Insight to Decision”, and its look at how that process shapes the future of predictive data analytics. There we discussed, with a case study, how a business problem is converted into an analytics solution. Let’s continue from that discussion with a new topic: “Assessing Feasibility of Analytical Solution”.
2.2 Assessing Feasibility of Analytical Solution:
Once the set of candidate analytics solutions that address a business problem has been defined, the next task is to evaluate the feasibility of each solution. This involves considering the following questions:
Does the data needed for the solution exist within the organization, or can it be sourced externally?
Consider whether data collection mechanisms (e.g., sensors, APIs, or manual entry) can be implemented. Also, evaluate the time, cost, and legal or ethical considerations of data acquisition.
Does the organization have the infrastructure, processes, and skills necessary to integrate and operationalize the insights from the solution?
Let’s discuss the first issue affecting the feasibility of an analytics solution. In general, evaluating the feasibility of an analytics solution in terms of its data requirements involves aligning the following issues with the requirements of the analytics solution:
The key objects in the company’s data model and the data available regarding them. For example, in a bricks-and-mortar retail scenario, the key objects are likely to be customers, products, sales, suppliers, stores, and staff. In an insurance scenario, the key objects are likely to be policyholders, policies, claims, policy applications, investigations, brokers, members, investigators, and payments.
The connections that exist between key objects in the data model. For example, in a banking scenario, is it possible to connect the multiple accounts that a single customer might own? Similarly, in an insurance scenario, is it possible to connect the information from a policy application with the details (e.g., claims, payments, etc.) of the resulting policy itself?
The granularity of the data that the business has available. In a bricks-and-mortar retail scenario, data on sales might only be stored as a total number of sales per product type per day, rather than as individual items sold to individual customers.
The volume of data involved. The amount of data that is available to an analytics project is important because (a) some modern datasets are so large that they can challenge even the most advanced machine learning tools; and (b) conversely, very small datasets can limit our ability to evaluate the expected performance of a model after deployment.
The time horizon for which data is available. It is important that the data available covers the period required for the analytics solution. For example, in an online gaming scenario, it might be possible to find out every customer’s account balance today but utterly impossible to find out what their balance was last month, or even yesterday.
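The five checks above can be turned into a simple checklist. The following is a minimal Python sketch under stated assumptions, not a definitive implementation: the `DataProfile` structure, its field names, the two granularity levels, and the example values are all hypothetical and would need to be adapted to a real data model. For brevity, the volume check only covers the "too little data" case.

```python
from dataclasses import dataclass
from datetime import date

# Finer granularity gets a lower rank. These two levels are illustrative only.
GRANULARITY_RANK = {"transaction": 0, "daily_total": 1}


@dataclass
class DataProfile:
    """Describes data either as a solution needs it or as the business holds it."""
    objects: set          # key objects, e.g. {"customers", "sales"}
    links: set            # frozenset pairs of objects that can be connected
    granularity: str      # "transaction" or "daily_total"
    rows: int             # records needed (requirement) or held (inventory)
    history_start: date   # earliest date needed / earliest date available


def data_gaps(required: DataProfile, available: DataProfile) -> list:
    """Return a list of human-readable gaps; an empty list means no gap was found."""
    gaps = []
    missing = required.objects - available.objects
    if missing:
        gaps.append(f"missing key objects: {sorted(missing)}")
    unlinked = required.links - available.links
    if unlinked:
        gaps.append(f"cannot connect: {sorted(sorted(pair) for pair in unlinked)}")
    if GRANULARITY_RANK[available.granularity] > GRANULARITY_RANK[required.granularity]:
        gaps.append(f"data only available at '{available.granularity}' level")
    if available.rows < required.rows:
        gaps.append(f"only {available.rows} records, need {required.rows}")
    if available.history_start > required.history_start:
        gaps.append(f"history starts {available.history_start}, need {required.history_start}")
    return gaps


# Hypothetical retail example: sales are stored only as daily totals.
required = DataProfile(
    objects={"customers", "sales"},
    links={frozenset({"customers", "sales"})},
    granularity="transaction",
    rows=10_000,
    history_start=date(2022, 1, 1),
)
available = DataProfile(
    objects={"customers", "sales", "stores"},
    links=set(),
    granularity="daily_total",
    rows=50_000,
    history_start=date(2023, 6, 1),
)
gaps = data_gaps(required, available)
for gap in gaps:
    print(gap)
```

In this made-up example the key objects and the volume are sufficient, but the solution would still be flagged on connections, granularity, and time horizon.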
The second and third issues affecting the feasibility of an analytics solution are:
When choosing a predictive analytics solution, it’s important to consider if the business can easily use the insights it provides without making major changes to its current processes. If a solution requires big changes, the business may not be ready for it, no matter how good the solution is.
The analytics practitioner first evaluates which solutions are realistic based on available data and the business’s ability to use them. After narrowing down the options, the business and the practitioner set clear goals for success, like how accurate the model needs to be or how it will impact the business, before moving forward with implementation.
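The two steps just described, narrowing down the options and then agreeing on goals for success, can be sketched in a few lines of Python. This is only an illustration: the `Candidate` and `SuccessCriterion` classes, the candidate names, the feasibility flags, and the 0.80 accuracy target are all hypothetical placeholders, not values from the case study.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate analytics solution with the outcome of its feasibility review."""
    name: str
    data_feasible: bool      # is the required data available or obtainable?
    capacity_feasible: bool  # can the business act on the insights as it operates today?


@dataclass
class SuccessCriterion:
    """A goal agreed between the business and the practitioner."""
    metric: str
    target: float

    def is_met(self, observed: float) -> bool:
        # Assumes higher values are better for this metric.
        return observed >= self.target


def shortlist(candidates):
    """Keep only candidates that pass both the data and the capacity review."""
    return [c for c in candidates if c.data_feasible and c.capacity_feasible]


# Hypothetical review results for three unnamed candidate solutions.
candidates = [
    Candidate("solution_a", data_feasible=True, capacity_feasible=True),
    Candidate("solution_b", data_feasible=True, capacity_feasible=False),
    Candidate("solution_c", data_feasible=False, capacity_feasible=True),
]
remaining = shortlist(candidates)
goal = SuccessCriterion(metric="accuracy", target=0.80)

print([c.name for c in remaining])  # ['solution_a']
print(goal.is_met(0.83))            # True
```

A solution that fails either review is dropped regardless of how well it might perform, which mirrors the point that a business may not be ready for a solution no matter how good it is.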
Case Study: Motor Insurance Fraud
Returning to the motor insurance fraud detection case study, below we evaluate the feasibility of each proposed analytics solution in terms of data and business capacity requirements.
[Claim prediction] Data Requirements: This solution would require that a large collection of historical claims marked as fraudulent and non-fraudulent exist. Similarly, the details of each claim, the related policy, and the related claimant would need to be available. Capacity Requirements: Given that the insurance company already has a claims investigation team, the main requirements would be that a mechanism could be put in place to inform claims investigators that some claims were prioritized above others. This would also require that information about claims become available in a suitably timely manner so that the claims investigation process would not be delayed by the model.
[Member prediction] Data Requirements: This solution would not only require that a large collection of claims labeled as either fraudulent or non-fraudulent exist with all relevant details, but also that all claims and policies can be connected to an identifiable member. It would also require that any changes to a policy are recorded and available historically. Capacity Requirements: This solution first assumes that it is possible to run a process every quarter that performs an analysis of the behavior of each customer. More challenging, there is the assumption that the company has the capacity to contact members based on this analysis and can design a way to discuss this issue with customers highlighted as likely to commit fraud without damaging the customer relationship so badly as to lose the customer. Finally, there are possibly legal restrictions associated with making this kind of contact.
[Application prediction] Data Requirements: Again, a historical collection of claims marked as fraudulent or non-fraudulent along with all relevant details would be required. It would also be necessary to be able to connect these claims back to the policies to which they belong and to the application details provided when the member first applied. It is likely that the data required for this solution would stretch back over many years, as the time between making a policy application and making a claim could cover decades. Capacity Requirements: The challenge in this case would be to integrate the automated application assessment process into whatever application approval process currently exists within the company.
[Payment prediction] Data Requirements: This solution would require the full details of policies and claims as well as data on the original amount specified in a claim and the amount ultimately paid out. Capacity Requirements: Again, this solution assumes that the company has the potential to run this model in a timely fashion whenever new claims arise and also has the capacity to make offers to claimants.
This assumes the existence of a customer contact center or something similar. For the purposes of the case study, we assume that after the feasibility review, it was decided to proceed with the claim prediction solution, in which a model will be built that can predict the likelihood that an insurance claim is fraudulent.
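The capacity requirement noted for the claim prediction solution, a mechanism that tells investigators which claims are prioritized above others, could look something like the sketch below. It is a rough illustration only: no model has been built at this stage, so the `ScoredClaim` type, the claim IDs, the fraud scores, and the investigator capacity of two claims are all hypothetical.

```python
from typing import NamedTuple


class ScoredClaim(NamedTuple):
    claim_id: str
    fraud_score: float  # hypothetical model output between 0 and 1


def investigation_queue(claims, capacity):
    """Return the `capacity` highest-scoring claims, most suspicious first."""
    ranked = sorted(claims, key=lambda c: c.fraud_score, reverse=True)
    return ranked[:capacity]


# Made-up scores for four open claims.
claims = [
    ScoredClaim("C-101", 0.12),
    ScoredClaim("C-102", 0.87),
    ScoredClaim("C-103", 0.45),
    ScoredClaim("C-104", 0.91),
]
queue = investigation_queue(claims, capacity=2)
print([c.claim_id for c in queue])  # ['C-104', 'C-102']
```

Ranking by score rather than applying a fixed cut-off keeps the queue the same size as the investigation team's capacity, so the model reorders existing work instead of adding to it.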
Thank you for joining us! If you enjoyed this edition, consider giving it a like. We’d love to hear your thoughts, so drop a comment below!
In the next edition, we will continue our topic with “Designing the Analytical Base Table.”




