Recommendation Systems. Simple concept – Powerful applications!

What is a Recommendation System?

It’s one of the most popular data science applications: a system that predicts how likely a user is to prefer an item, based on their past behavior. This is typically done by employing a machine learning algorithm that predicts user preferences for a particular entity. Recommendation systems have a wide variety of applications and are used by many of the big technology companies to recommend products to their customers. For instance, Amazon uses them for product recommendations, YouTube for video recommendations, Netflix and IMDb for movie recommendations, and Facebook and Twitter for friend recommendations.

  • The diagram below demonstrates the recommender systems method.

Recommendation System Mechanism:

The engine of the recommendation system filters the data via different machine learning algorithms, and based on that filtering, it can predict the most relevant entities to be recommended. After studying the previous behaviors of the users, it recommends products/services that the user may be interested in.

The working of a recommendation engine can be broken into these 3 steps:

1- Data Collection: The techniques that can be used to collect data are:

  • Explicit, where data is provided intentionally by the user as information (e.g. user input such as movie ratings)
  • Implicit, where data is not provided intentionally but is gathered from the user’s activity stream (e.g. search history, clicks, order history, etc.)

2- Data Storage: The data can be kept in cloud storage such as a SQL database, a NoSQL database, or some other kind of object storage; the choice depends on the type and amount of data. The more data the storage holds for the model, the better the recommendation system can become.

3- Recommendation System Methods:

There are several methods in recommendation systems, but two major approaches to filtering the data:

  1. Collaborative Filtering: It makes recommendations by combining your experience with the experiences of other people.
  2. Content-Based Filtering (the one I used in implementing my movie recommendation system): It is based on product attributes, that is, the item description and the preferences in the user’s profile. It calculates the similarity between different products on the basis of their attributes, treats recommendation as a user-specific classification problem, and learns a classifier for the user’s likes and dislikes based on product features.
  • The diagram below demonstrates content-based filtering recommender systems.
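As a minimal sketch of content-based filtering, the snippet below scores movies by the cosine similarity of their attribute vectors. The movie titles and the one-hot genre encoding are invented purely for illustration:

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length attribute vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical one-hot genre attributes: [Action, Comedy, Drama, Sci-Fi]
movies = {
    "Movie A": [1, 0, 0, 1],
    "Movie B": [1, 0, 0, 1],
    "Movie C": [0, 1, 1, 0],
}

def recommend(liked, catalog, top_n=2):
    # Rank every other movie by similarity to the one the user liked.
    scores = [(title, cosine(catalog[liked], vec))
              for title, vec in catalog.items() if title != liked]
    return sorted(scores, key=lambda t: t[1], reverse=True)[:top_n]

print(recommend("Movie A", movies))
```

Here “Movie B”, which shares both genres with “Movie A”, comes out on top; richer attribute vectors (keywords, tags, descriptions) plug into the same similarity computation.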

Recommendation System Applications:

There is a wide variety of applications for recommendation systems, especially in the data science field. For example, music and video companies like Netflix, YouTube, and Spotify use them to generate music and video recommendations, Amazon uses them for product recommendations, and social media platforms such as Facebook and Twitter use them for friend and content recommendations. Restaurants and hotels use them to generate food-related recommendations, and they also appear in research-article discovery, financial services, and life insurance.

Implementing Movie Recommendation System in Python

One simple and direct way to develop a movie recommender system is to use the correlation between the attributes of the movies: the system finds similarities between movies in order to make a suitable recommendation for the user. I used the MovieLens data from Kaggle and employed a machine learning algorithm to filter the data with the content-based filtering method, in order to make these evaluations and predictions. Specifically, I used the K-nearest neighbor classifier model, which finds the k most similar items to a particular instance based on a given distance metric.

  • The diagram below demonstrates the K-nearest neighbor classifier model.
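The k-nearest idea can be sketched in a few lines of standard-library Python. The feature vectors below are invented stand-ins for whatever numeric attributes the movies carry:

```python
from math import dist  # Euclidean distance (Python 3.8+)

def k_nearest(query, items, k=3):
    # Return the k catalog items closest to `query` under Euclidean distance.
    return sorted(items, key=lambda item: dist(query, item["features"]))[:k]

# Hypothetical movie feature vectors, e.g. [avg_rating, year_scaled, popularity_scaled].
catalog = [
    {"title": "Movie A", "features": [4.0, 0.9, 0.7]},
    {"title": "Movie B", "features": [3.9, 0.8, 0.6]},
    {"title": "Movie C", "features": [1.5, 0.2, 0.1]},
]

neighbors = k_nearest([4.0, 0.9, 0.7], catalog, k=2)
print([m["title"] for m in neighbors])
```

A library implementation such as scikit-learn’s `NearestNeighbors` does the same thing with smarter indexing, but the distance-then-sort logic is all there is to the core idea.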

After doing some Exploratory Data Analysis (EDA), I found that there are only 6 features across the 2 (merged) datasets, so I decided to extract as many new features from the given ones as possible. Here are some observations from exploring the dataset.

About the dataset:

  • Number of Movies in the Dataset: 10325 movies
  • Number of Users in the Dataset: 668 users

Plot 1:

  • Most of the rated movies have a rating of 4.0
  • Only 1198 movies have a rating of 0.5 (the lowest rating)

Plot 2:

  • It shows the counts of the top 10 genres into which the movies in this dataset are categorized.
  • The genre with the highest number of movies is Drama

Experiments Results:

Using the K-nearest neighbor classifier as the prediction model, the accuracy score was 48.5%, narrowly beating the baseline score of 48.2%.

I also tried optimizing the model with GridSearchCV’s best parameters, but the accuracy did not increase.
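As a pure-Python stand-in for what GridSearchCV automates, the sketch below evaluates a few candidate values of k on a validation split and keeps the best one. The tiny dataset and the candidate grid are invented for illustration:

```python
from math import dist

def knn_predict(query, train, k):
    # Majority label among the k nearest training points.
    nearest = sorted(train, key=lambda p: dist(query, p[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def grid_search_k(train, valid, ks):
    # Try each candidate k and keep the one with the best validation
    # accuracy; this is the loop GridSearchCV runs over a parameter grid.
    def accuracy(k):
        hits = sum(knn_predict(x, train, k) == y for x, y in valid)
        return hits / len(valid)
    return max(ks, key=accuracy)

# Tiny invented dataset of (features, liked?) pairs.
train = [([1.0, 1.0], 1), ([1.1, 0.9], 1), ([0.0, 0.1], 0), ([0.1, 0.0], 0)]
valid = [([0.9, 1.0], 1), ([0.0, 0.0], 0)]
best_k = grid_search_k(train, valid, ks=[1, 3])
print(best_k)
```

GridSearchCV additionally cross-validates each candidate rather than using a single split, but the selection principle is the same.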

Further Recommendations:

Although I extracted more than 20 features from the original 6, there was a shortage of information about the movies and their details! So, I believe the accuracy score could be better if I had more details related to the movies (e.g. actors and director).

How YouTube Recommends Videos


Back in 2008, YouTube passed Yahoo! to become the second largest search engine in the world, behind only Google. Today, we can ask a related question: “Is YouTube about to pass Amazon as the largest-scale and most sophisticated industrial recommendation system in existence?” This question isn’t rhetorical, because we don’t know the answer; YouTube fiercely competes with Amazon’s recommendation system.

YouTube’s suggested videos are a force multiplier for YouTube’s search algorithm, and one we need to understand.

Earlier YouTube Recommendation Process

To maximize your presence in YouTube search and suggested videos, you need to make sure your metadata is well-optimized. This includes your video’s title, description, and tags. Most SEOs focus on the search results – because that’s what matters in Google.

How do you create metadata tags on YouTube?

Look at the relevant top-ranking videos and reuse as many of their tags as are also relevant for your video.

Recent YouTube Recommendation Behaviour

The YouTube recommendation approach has changed now: to get repeat viewers, a video must be picked up by the YouTube recommendation process. Most YouTube marketers know that appearing in suggested videos can generate almost as many views as appearing in YouTube’s search results.

Why? Because viewers tend to watch multiple videos during sessions that last about 40 minutes, on average. So, a viewer might conduct one search, watch a video, and then go on to watch a suggested video. In other words, you might get two or more videos viewed for each search that’s conducted on YouTube. That’s what makes suggested videos a force multiplier for YouTube’s search algorithm.

How does YouTube Recommend Videos – Lighter Approach

There is a video on the YouTube Creators channel entitled “How YouTube’s Suggested Videos Work”.

As the video’s 300-word description explains:

“Suggested Videos are a personalized collection of videos that an individual viewer may be interested in watching next, based on prior activity.”

“Studies of YouTube consumption have shown that viewers tend to watch a lot more when they get recommendations from a variety of channels and suggested videos do just that. Suggested Videos are ranked to maximize engagement for the viewer.”

So, optimizing your metadata still helps, but you also need to create a compelling opening to your videos, maintain and build interest throughout the video, as well as engage your audience by encouraging comments and interacting with your viewers as part of your content.

How YouTube Recommends Videos – Recommender Systems

Recommender Systems are among the most common forms of Machine Learning that users encounter, whether they’re aware of it or not. They power curated timelines on Facebook and Twitter, and “suggested videos” on YouTube.

Previously, this was formulated as a matrix factorization problem that attempts to predict a movie’s rating for a particular user. Many are now approaching the problem with Deep Learning; the intuition is that non-linear combinations of features may yield a better prediction than a traditional matrix factorization approach can.

In 2016, Covington, Adams, and Sargin demonstrated the benefits of this approach with “Deep Neural Networks for YouTube Recommendations”, making Google one of the first companies to deploy production-level deep neural networks for recommender systems.

Given that YouTube is the second most visited website in the United States, with over 400 hours of content uploaded per minute, recommending fresh content is no straightforward task. In their research paper, Covington et al. demonstrate a two-stage information retrieval approach, where one network generates recommendations and a second network ranks these generated recommendations. This approach is quite thoughtful: since recommending videos can be posed as an extreme multiclass classification problem, having one network reduce the cardinality of the task from a few million candidates to a few hundred permits the ranking network to take advantage of more sophisticated features, which may have been too minute for the candidate generation model to learn.
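The two-stage idea (a cheap model trims millions of items to a few hundred, then a costlier model scores only the survivors) can be sketched as follows. The corpus, the dot-product “cheap score”, and the stand-in expensive scorer are all invented for illustration:

```python
def generate_candidates(user_vector, corpus, n=100):
    # Stage 1: a cheap score (here a dot product with a user embedding)
    # trims a huge corpus down to a few hundred candidates.
    cheap = lambda item: sum(u * f for u, f in zip(user_vector, item["embedding"]))
    return sorted(corpus, key=cheap, reverse=True)[:n]

def rank(candidates, expensive_score):
    # Stage 2: a richer, costlier model scores only the survivors.
    return sorted(candidates, key=expensive_score, reverse=True)

# Invented toy corpus of "videos" with 2-d embeddings.
corpus = [{"id": i, "embedding": [(i % 7) / 7, (i % 11) / 11]} for i in range(1000)]
user = [0.3, 0.9]

candidates = generate_candidates(user, corpus, n=100)
ranked = rank(candidates, expensive_score=lambda v: v["embedding"][1])
print(len(corpus), "->", len(candidates), "->", ranked[0]["id"])
```

The payoff is purely computational: the expensive scorer runs on 100 items instead of 1000 (or, at YouTube’s scale, on hundreds instead of millions).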


There were two main factors behind YouTube’s Deep Learning approach towards Recommender Systems:

  • Scale: Due to the immense sparsity of these matrices, it’s difficult for previous matrix factorization approaches to scale amongst the entire feature space. Additionally, previous matrix factorization approaches have a difficult time handling a combination of categorical and continuous variables.
  • Consistency: Many other product-based teams at Google have switched to deep learning as a general framework for learning problems. Since Google Brain has released TensorFlow, it is sufficiently easy to train, test, and deploy deep neural networks in a distributed fashion.

Network Structure

There are two networks at play:

  • The candidate generation network takes the user’s activity history (e.g. IDs of watched videos, search history, and user-level demographics) and outputs a few hundred videos that might broadly apply to the user. The general idea is that this network should optimize for precision; each instance should be highly relevant, even if that requires forgoing some items which may be widely popular but irrelevant.
  • In contrast, the ranking network takes a richer set of features for each video and scores each item from the candidate generation network. For this network, it’s important to have high recall; it’s okay for some recommendations to not be very relevant as long as you’re not missing the most relevant items.

On the whole, this network is trained end-to-end, and the training and test sets consist of hold-out data. In other words, the network is given a user’s watch history up until some time t and is asked what they would like to watch at time t+1. The authors believe this was among the best ways to recommend videos, given the episodic nature of video consumption on YouTube.

Performance Hacks

In both the candidate generation and candidate ranking networks, the authors leverage various tricks to reduce dimensionality and improve model performance. We discuss these here, as they’re relevant to both models.

First, they trained a subnetwork to transform sparse features (such as video IDs, search tokens, and user IDs) into dense features by learning an embedding for these features. This embedding is learned jointly with the rest of the model parameters via gradient descent.
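As a toy illustration of the embedding idea: sparse IDs index into a table of dense vectors, and a variable-length watch history is averaged into one fixed-size input. In the real system the table is learned jointly with the network; here the IDs and vector values are fixed, invented numbers:

```python
# Toy embedding table mapping sparse video IDs to dense 2-d vectors.
# A production system learns these values by gradient descent;
# here they are fixed for illustration.
embedding = {
    "vid_1": [0.25, 0.75],
    "vid_2": [0.75, 0.25],
    "vid_3": [0.90, 0.10],
}

def watch_history_feature(history, dims=2):
    # Average the embeddings of watched videos into one fixed-size
    # dense input, regardless of how many videos the user watched.
    vecs = [embedding[v] for v in history if v in embedding]
    if not vecs:
        return [0.0] * dims
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

print(watch_history_feature(["vid_1", "vid_2"]))  # averages the two vectors
```

Unknown IDs are simply skipped here; a real system would map them to a shared out-of-vocabulary embedding instead.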

Secondly, to aid against the exploration/exploitation problem, they feed the age of the training example in as a feature. This helps overcome the implicit bias towards stale content that arises because models are trained on the average watch likelihood over the training window. At serving time, they simply set the age of the example to zero to compensate for this factor.

Ranking the Predictions

The fundamental idea behind partitioning the recommender system into two networks is that this gives the ranking network the ability to examine each video with a finer-toothed comb than the candidate generation model could.

For example, the candidate generation model may only have access to features such as video embedding, and the number of watches. In contrast, the ranking network can take features such as the thumbnail image and the interest of their peers to provide a much more accurate scoring.

The objective of the ranking network is to maximize the expected watch time for any given recommendation. Covington et al. decided to attempt to maximize watch time over the probability of a click, due to the common “clickbait” titles in videos.

Similar to the candidate generation network, the authors use embedding spaces to map sparse categorical features into dense representations. Any features which relate to multiple items (i.e. searches over multiple video IDs, etc) are averaged before being fed into the network. However, categorical features which depend upon the same underlying feature (i.e. video IDs of the impression, last video ID watched, etc) are shared between these categories to preserve memory and runtime requirements.

As far as continuous features go, they’re normalized in two ways.

  • First, a standard normalization into [0, 1), using the cumulative distribution of each feature.
  • Secondly, in addition to the normalized feature x, the forms sqrt(x) and x² are also fed in. This permits the model to create super- and sub-linear functions of each feature, which proved crucial to improving offline accuracy.
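The two normalizations above can be sketched in a few lines; the raw feature values below are invented:

```python
from math import sqrt

def cdf_normalize(values):
    # Map each value to the fraction of observations strictly below it:
    # an empirical cumulative distribution scaled into [0, 1).
    n = len(values)
    return [sum(other < v for other in values) / n for v in values]

def expand(x):
    # Feed the normalized feature alongside its square root and square,
    # letting the network form sub- and super-linear functions of it.
    return [x, sqrt(x), x * x]

# Hypothetical raw values of one continuous feature across four examples.
scaled = cdf_normalize([120.0, 3.0, 47.0, 47.0])
print(scaled)
print(expand(scaled[0]))
```

The quadratic scan inside `cdf_normalize` is fine for a sketch; at scale you would sort once and look ranks up instead.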

To predict expected watch time, the authors used a weighted logistic regression: clicked impressions were weighted by the observed watch time, whereas negative examples all received unit weight. Under this weighting, the odds learned by the model are approximately E[T]·P, where E[T] is the expected watch time of the impression and P is the probability of clicking the video.
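Following the paper’s weighting argument (a sketch in my own notation, not the authors’ exact derivation): with N training examples, k clicked impressions, and watch times T_i on the clicked ones, the odds learned by the weighted logistic regression come out as

```latex
\frac{\sum_i T_i}{N-k} \;\approx\; \mathrm{E}[T]\,(1+P) \;\approx\; \mathrm{E}[T],
```

where P is the click probability; because P is small on YouTube, the learned odds end up close to the expected watch time E[T].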

Finally, the authors demonstrated the impact of wider and deeper networks on per-user loss, defined as the total amount of mispredicted watch time relative to the total watch time on held-out data. This lets the model be evaluated on a proxy for a good recommendation, rather than on predicting a good recommendation directly.


“Deep Neural Networks for YouTube Recommendations” was one of the first papers to highlight the advancements that Deep Learning can provide for Recommender Systems, and it appeared in ACM’s 2016 Conference on Recommender Systems. It laid the foundation for many papers afterward. It has been a fantastic journey for YouTube over the past decade to improve its recommendation process, which in turn keeps viewers engaged. Statistics suggest that the YouTube mobile app has replaced television watching to a great extent around the world. That is not at all a simple task, and we must sincerely appreciate the people who made it happen.

“We will soon trade in our clunky flat screens for its handheld cousin, the smartphone and its YouTube app.”

Why Edge Computing is gaining popularity

Edge Computing

Edge Computing brings computation and data storage closer to the devices where it’s being gathered, rather than relying on a central location that can be thousands of miles away. This is done so that data, especially real-time data, does not suffer latency issues that can affect an application’s performance. In addition, companies can save money by having the processing done locally, reducing the amount of data that needs to be processed in a centralized or cloud-based location.

Gartner defines edge computing as “a part of a distributed computing topology in which information processing is located close to the edge – where things and people produce or consume that information.”

Ubiquitous Computing

Ubiquitous computing is a concept in software engineering and computer science where computing is made to appear anytime and everywhere. In contrast to desktop computing, ubiquitous computing can occur using any device, in any location, and across any format.

And we are probably seeing this in our own everyday life. For example, at home, we might be using an Alexa device from Amazon or we might be using Google home. We might even have an intelligent fridge or a car we can talk to.

As companies increasingly leverage ubiquitous computing to support multiple types of applications and systems, a massive amount of data is generated for decision making. However, sending all the data to the cloud can result in latency. Edge computing can drive sub-second responses by moving both computing and data closer to the user, reducing latency, minimizing data threats, and boosting bandwidth.

Evolution of Computing

To understand Edge Computing, we need to travel back a few decades and see how Computing has evolved in the past 50 years. The below picture provides a quick recap of the evolution of Computing.

How Edge Computing works

Edge computing was developed due to the exponential growth of IoT devices, which connect to the internet for either receiving information from the cloud or delivering data back to the cloud. And many IoT devices generate enormous amounts of data during the course of their operations.

Think about devices that monitor manufacturing equipment on a factory floor or an internet-connected video camera that sends live footage from a remote office. While a single device producing data can transmit it across a network quite easily, problems arise when the number of devices transmitting data at the same time grows. Instead of one video camera transmitting live footage, multiply that by hundreds or thousands of devices. Not only will quality suffer due to latency, but the costs in bandwidth can be tremendous.

Edge-computing hardware and services help solve this problem by being a local source of processing and storage for many of these systems. An edge gateway, for example, can process data from an edge device and then send only the relevant data back through the cloud, reducing bandwidth needs. Or it can send data back to the edge device in the case of real-time application needs.
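The gateway’s filtering role can be sketched as follows: process raw sensor readings locally, keep a compact summary, and forward only the readings that need cloud attention. The threshold, payload shape, and sample values are all invented:

```python
def edge_filter(readings, threshold=80.0):
    # Process raw sensor readings locally: build a compact summary and
    # forward only readings exceeding the alert threshold, instead of
    # streaming every raw reading to the cloud.
    to_cloud = [r for r in readings if r["value"] > threshold]
    summary = {
        "count": len(readings),
        "mean": sum(r["value"] for r in readings) / len(readings),
    }
    return summary, to_cloud

# A minute of hypothetical temperature readings from one device.
readings = [{"sensor": "temp-01", "value": v} for v in (71.2, 69.8, 84.5, 70.1)]
summary, alerts = edge_filter(readings)
print(summary["count"], "readings in,", len(alerts), "forwarded to the cloud")
```

Out of four readings, only the one anomalous value crosses the wire; the rest travel upstream only as a count and a mean, which is exactly the bandwidth saving the paragraph above describes.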

These edge devices can include many different things, such as an IoT sensor, an employee’s notebook computer, their latest smartphone, the security camera, or even the internet-connected microwave oven in the office break room. Edge gateways themselves are considered edge devices within an edge-computing infrastructure.

Why does Edge Computing matter

For many companies, the cost savings alone can be a driver towards deploying an edge-computing architecture. Companies that embraced the cloud for many of their applications may have discovered that the costs in bandwidth were higher than they expected.

Increasingly, though, the biggest benefit of edge computing is the ability to process and store data faster, enabling more efficient real-time applications that are critical to companies. Before edge computing, a smartphone scanning a person’s face for facial recognition would need to run the facial recognition algorithm through a cloud-based service, which would take a lot of time to process. With an edge computing model, the algorithm could run locally on an edge server or gateway, or even on the smartphone itself, given the increasing power of smartphones. Applications such as virtual and augmented reality, self-driving cars, smart cities, even building-automation systems require fast processing and response.

Computing as close as possible to the point of use has always been important for applications requiring low-latency data transmission, very high bandwidth, or powerful local processing capabilities — particularly for machine learning (ML) and other analytics.

Here are some interesting use cases across industries:

Use Case (a) Autonomous vehicles

One of the leading current uses is for autonomous vehicles, which need data from the cloud. If access to the cloud is denied or slowed, they must continue to perform; there is no room for latency. The amount of data produced by all sensors on a vehicle is prodigious and must not only be processed locally, but anything sent up to the cloud must be compressed and transmitted on an as-needed basis to avoid overwhelming available bandwidth and taking precious time. IoT applications in general are important drivers of edge computing because they share a similar profile.

Use Case (b) In-hospital patient monitoring

Healthcare contains several edge opportunities. Currently, monitoring devices (e.g. glucose monitors, health tools, and other sensors) are either not connected, or, where they are, large amounts of unprocessed data from the devices must be stored on a third-party cloud. This presents security concerns for healthcare providers.

An edge on the hospital site could process data locally to maintain data privacy. Edge also enables right-time notifications to practitioners of unusual patient trends or behaviours (through analytics/AI), and the creation of 360-degree view patient dashboards for full visibility.

Use Case (c) Remote monitoring of assets in the oil and gas industry

Oil and gas failures can be disastrous, so their assets need to be carefully monitored.

However, oil and gas plants are often in remote locations. Edge computing enables real-time analytics with processing much closer to the asset, meaning there is less reliance on good quality connectivity to a centralized cloud.

Privacy and Security

However, as is the case with many new technologies, solving one problem can create others. From a security standpoint, data at the edge can be troublesome, especially when it’s being handled by different devices that might not be as secure as a centralized or cloud-based system. As the number of IoT devices grows, it’s imperative that IT understand the potential security issues around these devices, and make sure those systems can be secured. This includes making sure that data is encrypted, and that the correct access-control methods are implemented.

What about 5G

Around the world, carriers are deploying 5G wireless technologies, which promise the benefits of high bandwidth and low latency for applications, enabling companies to go from a garden hose to a firehose with their data bandwidth. Instead of just offering faster speeds and telling companies to continue processing data in the cloud, many carriers are working edge-computing strategies into their 5G deployments to offer faster real-time processing, especially for mobile devices, connected cars, and self-driving cars.

The Future of Edge Computing

Shifting data processing to the edge of the network can help companies take advantage of the growing number of IoT edge devices, improve network speeds, and enhance customer experiences. The scalable nature of edge computing also makes it an ideal solution for fast-growing, agile companies, especially if they are already making use of colocation data centers and cloud infrastructure.

By harnessing the power of edge computing, companies can optimize their networks to provide flexible and reliable service that bolsters their brand and keeps customers happy.

Edge computing offers several advantages over traditional forms of network architecture and will surely continue to play an important role for companies going forward. With more and more internet-connected devices hitting the market, innovative organizations have likely only scratched the surface of what’s possible with edge computing.

Sentiment Analysis is Worthless

Disclaimer: The article does not assume that readers have a data science background and thus excludes and masks any complexities behind sentiment analysis or data science.

Opinion mining has reached its peak with the introduction of tools that facilitate sharing ideas and thoughts with the public. Although the subjectivity of opinions affects how factual the information is, sentiment analysis plays a huge role in studying a targeted group’s perception of a certain entity or event. To mention a few applications where sentiment analysis shines: discovering the public’s reaction to an event, improving the customer-satisfaction process, and studying a brand’s or an entity’s reputation. However, there is a huge disconnect between these valuable applications and sentiment analysis as it is usually practiced, so I will try to connect the dots here and illustrate how sentiment analysis should fulfil business needs. Let’s start with a brief explanation of how sentiment analysis works and then move on to satisfying the title’s claim.

Sentiment Analysis

Sentiment analysis, as a part of natural language processing, is the task of discovering the emotional tone of a text as perceived by readers. It receives a text and outputs how positive, negative, or neutral it is. Other category schemes are used as well, such as [“Angry”, “Sad”, “Happy”, “Excited”] or [1, 2, 3, 4, 5], similar to a rating that goes from 1 (very negative) to 5 (very positive). I have chosen to group the techniques in terms of their limitations and end results, which yields two groups.



There are many words that we categorize conceptually as negative, positive, or neutral, and that is how the very first trials of sentiment classification in the literature worked. They were born right after the outburst of subjectivity analysis (detecting whether a text is opinionated or not) in the 1990s, to which the paper “Recognizing subjective sentences: a computational investigation of narrative text” made a huge contribution.

Short Overview

Word-level-based models at their core check whether the text has more positive words/phrases than negative ones, or vice versa, and classify based on that. I won’t go deeper into how they do this, as there are many well-known approaches: looking at the morphology of a word, using hand-crafted rules, learning automated “rules” through machine learning, or looking at the semantics of words. The important point is that these models operate only at the word level and don’t go far with the whole text’s semantics. Now let’s see how that works, focusing on just one category: “Negative” sentiment.

Figure 2 — Translation: I told the cashier Khalid that I got the wrong order, and he said that he can’t change it, what a bad service!

The example is pretty simple here (“Wrong” & “Bad”), but what if the text negated a positive word, as in “not good” or “not correct”? Here we move to negation handling (still word-level), where we check the words surrounding a positive/negative word to see whether they negate its polarity.
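A word-level classifier with this kind of negation handling fits in a few lines. The tiny English lexicon below is invented and stands in for the Arabic examples in the figures:

```python
# Invented toy lexicon; real systems use lists of thousands of words.
POSITIVE = {"good", "correct", "great"}
NEGATIVE = {"bad", "wrong", "bitter"}
NEGATORS = {"not", "never", "no"}

def word_level_sentiment(text):
    # Count positive vs. negative words; a negator directly before a
    # sentiment word flips its polarity (simple negation handling).
    words = text.lower().split()
    score = 0
    for i, w in enumerate(words):
        polarity = 1 if w in POSITIVE else -1 if w in NEGATIVE else 0
        if polarity and i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(word_level_sentiment("the service is not good"))  # negator flips "good"
print(word_level_sentiment("the order was wrong"))
```

This handles “not good”, but, exactly as argued next, it has no way to see that a “but” clause cancels earlier negativity or that an ambiguous word like “مر” needs context to resolve.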

Figure 3 — Translation: I told the cashier Khalid that my order is not correct, and he said that he can’t change it. The service is not good at all!

This solves the problem of negation. However, what if we have different examples like this:

Figure 4 — Translation: I got the wrong order but the cashier Khalid has solved my problem immediately
Figure 5 — Translation: The pistachio latte’s taste is too bitter. Couldn’t finish it!!

Word-level-based approaches struggle with these kinds of examples. In Figure 4, a negative word precedes “but”, then the negativity gets canceled by “solved my problem” and the text turns positive. Figure 5 falls into a deeper issue: the Arabic word “مر” can mean either “pass” or “bitter”, and the ambiguity can only be resolved by using Arabic diacritics, which not many people use, or by employing an extremely complicated parser. Both problems can be solved through the use of context and semantics.



Words are never independent in a text; each word can change the meaning or opinion of the whole text. Although some natural language processing tasks can escape the burden of context inclusion (a deeper dive into the semantics of words and their interactions), sentiment analysis cannot.

Time-Line Summary

Many early trials used rule-based approaches along with word morphology in order to include some semantics. Then came a movement towards models that group similar words, so that documents/sentences carry multiple topics based on the words mentioned (topic modeling), with Latent Dirichlet Allocation in 2003 as the strongest contributor. After that, deep learning took a long course: starting from word-level semantics, where the star was Word2Vec by Tomas Mikolov via the “Efficient Estimation of Word Representations in Vector Space” paper, then moving towards context-level semantics (contextualized embeddings), until reaching Transformers, which solved many efficiency and quality issues. The basic idea is that this long history produced models that cater for the context and semantics of words within documents. (There is amazing work on interpreting gigantic deep learning architectures, so the idea that these models cannot be interpreted is not fully true, especially when analyzing the core concept of Transformers: attention.)

On to a quick, simple example where the model includes a contextual representation of the text and can understand that the word “مر” here means “bitter”, not “pass”.

Figure 6 — Translation: The pistachio latte’s taste is too bitter. Couldn’t finish it!!

Sentiment Analysis and Business Value Disconnect


When we have millions of documents, whether App Store or Google Play comments for an app, Google reviews for a place, complaints about a company, or tweets from a region or hashtag, applying sentiment analysis and getting, say, 10% positive, 20% neutral, and 70% negative for an app or a Twitter hashtag is basically useless, because the result is not connected to any particular topic. Knowing that some hashtag is very negative only tells you the what, not the why.

You might say you’ll just filter the text by a keyword, but that keyword was chosen by you, not by the data! How many words are you going to account for? Are those words actually used by customers, and how heavily? The data (reviews, comments, tweets) should drive the choice of aspects, or, more elaborately, the collection of hundreds of keywords you should look for. The key takeaway is that you need to know the aspects in order to know what exactly is so positive or negative about your place, app, Twitter marketing campaign, or, generally speaking, your business, and then improve.
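As a sketch of letting the data propose the aspects rather than hand-picking a keyword, the snippet below counts the most frequent content words across reviews. The stop-word list, tokenization, and reviews are all invented for illustration; a real system would work on lemmas or phrases, not raw tokens:

```python
from collections import Counter

# Invented, tiny stop-word list; real lists run to hundreds of words.
STOPWORDS = {"the", "was", "is", "my", "a", "and", "it", "too"}

def top_aspects(reviews, n=3):
    # Let the data surface candidate aspects: count content words across
    # all reviews instead of filtering by one hand-picked keyword.
    words = (w for review in reviews for w in review.lower().split()
             if w not in STOPWORDS)
    return [word for word, _ in Counter(words).most_common(n)]

reviews = [
    "the service was slow",
    "slow service and wrong order",
    "my order was wrong",
]
print(top_aspects(reviews))
```

Even on three toy reviews, the customers’ own vocabulary (“service”, “slow”, “wrong”, “order”) surfaces by itself, which is the point: the aspect list comes from the data, not from the analyst.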


We have researched this subject in order to solve this problem with a different methodology than what is well-known in the literature, for the following reasons:

  1. Scarce Arabic NLP literature
  2. Arabic NLP datasets are of low quality
  3. Off-the-shelf Arabic NLP base components have low quality
  4. Inherent domain-specificity for well-known algorithmic approaches in terms of practicality and generality

We have released our first Generalized Hybrid Aspect-Sentiment Detection and Tracking model; Figure 7 illustrates only its core capability. (The model is integrated within the Bloom System, which is part of our Customer-Success platform.)

Figure 7 — Translation: I told the cashier Khalid that I got the wrong order, and he said that he can’t change it, what a bad service!

One more thing to notice is that the sentiment has gone through multiple layers of indexing and statistical calculation so that it can be served as a metric comparable to the CSAT score used in Customer-Success Management. However, the aforementioned still does not address the issue!

Deeper Dive!

We have discovered that aspects are also not enough; we want a fine-grained specification of the problems behind the aspects given in Figure 7. What was bad about the customer service above is “Order Exchange” & “Wrong Order”, which should be detected by looking at “can’t change it” (ما اقدر اغير) and “my order is wrong” (طلبي غلط). Hence, through a combination of contextualized modeling and graph theory (our first text-representation layer for solving the issue), we are currently researching how to fully connect the dots down to the core of the problem, as Figure 8 elaborates:

Figure 8 — Translation: I told the cashier Khalid that I got the wrong order, and he said that he can’t change it, what a bad service!

With that, we can now discover:

  1. What the total CSAT Score is for a business
  2. Why the total CSAT Score is as such
  3. How to change the CSAT Score

and automatically generate actionable, well-defined recommendations that fit our Decision-Making Platform.

The Hard Truth about Data Science

One of the life-changing decisions you must have faced with some discomfort is the career path to follow. You must have asked what will happen if you choose a path and it turns out not to be your interest at all, or you might realize that only after a couple of years. In this article, I want to focus on choosing the path of being a data scientist: the other side of data science that is not well known to new joiners, and what data and data science mean outside the scientific realm.

Dilemma of Choice

With the sudden peak in popularity that Harvard Business Review contributed to in 2012, when it anointed “Data Science” the sexiest job of the 21st century, businesses started looking for data scientists to employ (even when they sometimes don’t need to). Consequently, ambitious students joined this demand wave by choosing this path.

If you search Google now for “Why should I learn data science”, you will find multiple reasons summarized as such: to become good at problem-solving, to have a lucrative career path, or because of the very high market demand. These reasons are too broad, not exclusive to data science, and never guaranteed, and there might well be better alternatives. Yet they are repeated everywhere while missing one main point: people will never be great at something unless they are fully devoted to it, and people popularizing data science unknowingly mask some challenges that must be overcome to be successful. Hence the title of this article.

Concealed Side

Every field has a difficult side; let’s elaborate on the kinds of predicaments and challenges data scientists face that are not usually well known.

Reading, Reading, and Reading

Not many people enjoy reading every day, and some of them are new joiners to data science. Data science is about reading: books, academic literature, articles, and so on. To bring great, truly valuable ideas that can improve your output, you must read a ton. Following data scientists on social media platforms, subscribing to research organizations’ email lists (my favorite is DeepAI’s), and always staying up to date is a must; your eyes must be everywhere. Most of what you think about is a byproduct of knowledge you have been introduced to, so be sure to have an abundance of it.

Furthermore, you have strong backup when fixing or detecting programming errors: exceptions are raised, the program crashes, the output is clearly wrong, etc. Not so with “theoretical bugs”. These bugs are too good at hiding, and you will never catch them if you are not a dedicated reader; you must understand the inner workings of what you aim to apply at a deep level. Theoretical bugs sometimes get detected after days, weeks, or months, or never, and then the model’s true quality is nowhere near what has been reported.

Living Under Uncertainty

Imagine working for a whole month on a project and then throwing it all away; how would that make you feel? Many people cannot accept failure and never let go. They go into a spiral of bad performance or multiple attempts at reviving a machine learning project that is already a lost cause. Data science is uncertain, and it always will be; that’s why it’s distinguished by the word science. Managers, too, must understand this uncertainty. To lead a successful data science project that is unique and valuable, you have to accept failure and be the first person to support the team, as failure is not easy to swallow.

To account for the risk of failure (for AI projects), I have briefly summarized some points that boost the probability of success, or at least mitigate failure:

  • Switch your data science jargon off and accurately define and communicate the business requirements
  • Do heavy research to define the algorithmic approaches and the model-quality KPIs that align with business needs (e.g. “Based on these references, we’re confident to mark > 85% accuracy as a KPI for use-case X”)
  • Be clear with stakeholders about requirements & KPIs, and communicate exactly what the quality metric means (further information in the Communication section).
  • Choose at least 3-5 fallback approaches in case the first approach fails, and make sure your timeline is buffered for this.
  • Fail fast, and let go if there’s no hope of achieving value rather than pushing the deadline


Communication

You must have heard the phrase “Explain it like I’m 5”; data science communication is all about this. Translating extreme complexity into minimal simplicity is the hardest skill for data scientists to improve: the better you get, the more complexity you face, and the harder it becomes. To mention a few cases where proper (AI-specific) communication is a must:

  • Project Initiation: Convincing stakeholders to initiate a project necessitates grasping the end goal. You need to simulate what it looks like and always attach it to a business value. If your main goal is to directly support a decision-making process in a certain industry, for example, then when presenting the project you should focus on simulating a decision-making scenario that the data science project helps with.
  • Limitations: Limitations are unknown to stakeholders but very well studied by data scientists. Limitations must be clarified from the beginning, as well as documented, by focusing on the “cannots”. For example: “The project cannot do X”.
  • Timeline: The project timeline should align with its value, and a proper Work Breakdown Structure must be prepared and communicated throughout the project’s life.
  • Performance Report and Continuous Monitoring: You must have communicated your model’s KPI beforehand, and you sometimes have to bring examples, because people have different perceptions of numbers. 85% accuracy might sound great to a person, but when introduced with an example it becomes, for the same person, garbage! (I usually like flipping the quality metric: for example, saying we will make 15 “mistakes” out of 100 “predictions” instead of saying 85% accuracy.) Also, when monitoring the model’s performance in production, mistakes can happen; you always have to be ready to offer a proper defense, or a proper retrospective, when presented with mistakes. One thing that is, unfortunately, usually not included in a data science curriculum is interpretability. You need to know why the model predicted an “Apple” instead of an “Orange”, and here is where the conundrum peaks! Some projects are critical, and every prediction carries a burden of responsibility, so account for the need for interpretability if the project requires it.
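The “flipped metric” framing mentioned above takes only a couple of lines; the helper name and the 85% figure are purely illustrative:

```python
def flip_accuracy(accuracy, total=100):
    """Restate an accuracy figure as expected mistakes per `total` predictions."""
    mistakes = round((1 - accuracy) * total)
    return f"about {mistakes} mistakes out of {total} predictions"

print(flip_accuracy(0.85))  # about 15 mistakes out of 100 predictions
```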

Bright Side

Allow me to coat this field with fascination using my own definitions, sacrificing some of the scientific jargon.

“Data” in a Different Dimension

Data is our way of representing the real world around us in a slightly different format than what we’re used to. It is a way to share information with others more accurately, and a method that lets us easily manipulate this information using a machine. It’s a technique for convincing others with evidence; it’s how we capture moments and occurrences of real-life events to be used later. Your five senses are considered data channels to your brain, just as you can consider your phone’s camera its sense of sight, or its microphone its sense of hearing. Each type of computer has such channeling mechanisms whereby it can receive different data in different formats. What then? The data will just sit there without any use. Here comes data science!

“Data Science” in a Different Dimension

Data science is an interdisciplinary field that uses scientific methods, statistics, mathematics, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data

V. Dhar

Let’s set that aside for a bit and go with a simpler overview. We previously mentioned that data is just a representation of the real world: texts, sounds, images, numbers, etc., but on its own this has no value. Data science transforms this representation into another representation that people can relate to; it adds value and more information, turning what was only vague data flowing around us into things that are easily understood. After that, it affects our decisions; it makes us realize things we didn’t know before; it changes our actions, and it might as well give us a prediction of what will happen if an action is changed. It might also tell us things we could not have known unless we learned them, or, even if we have learned them, it can tell us in a faster and more evidential way.

Imagine that you spend some amount of money every day. Wouldn’t it be useful to see where you spend that money, on a monthly basis, broken down by type of spending? You might also need to ask a friend for some money next month, or reduce how much you spend every day, if only you were able to estimate next month’s budget.
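That budgeting thought experiment can be made concrete in a few lines (the transactions and the observation window are invented for this sketch):

```python
from collections import defaultdict

# Hypothetical transactions over three observed days: (category, amount).
transactions = [
    ("food", 12.0), ("transport", 5.0), ("food", 8.0),
    ("coffee", 3.5), ("transport", 5.0), ("food", 10.5),
]
DAYS_OBSERVED = 3

# Monthly-style view: total spent per category.
per_category = defaultdict(float)
for category, amount in transactions:
    per_category[category] += amount

# Naive next-month estimate: observed total scaled to a 30-day month.
next_month_estimate = sum(per_category.values()) * 30 / DAYS_OBSERVED

print(dict(per_category))   # {'food': 30.5, 'transport': 10.0, 'coffee': 3.5}
print(next_month_estimate)  # 440.0
```

A real budget forecast would of course account for trends and seasonality, but even this naive extrapolation already answers the “should I borrow next month?” question.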

Why Learn Data Science?

“Why Learn Data Science?” is an interesting question… or… — Questions Alert! — is it? How interesting is it? Why is it interesting? And for whom exactly is it interesting? How many people find it interesting? How many find it boring? Can I compare how interesting that question is with respect to other questions? But wait: how can I represent the concept “interesting”? Also, can I predict the number of people who would be interested in that question this year and in the coming year? Can I predict whether a person would be interested in that question before I ask?

Can I — Brainstorming Alert! — answer these questions just by seeing how many people searched for that question on Google? Or how many people clicked on websites that answer it? Or publish a survey of related questions with that exact question answered, then publish the survey without that question and try to predict whether a person would answer “I am interested in that question” based on their other answers? Or can I just calculate the number of junior data scientists in a region at a certain time?

Data science will give you the ability to ask questions about anything you see, read, or listen to in your everyday life, whether as simple as the question above or as hard as a Large Hadron Collider problem. It will make you capable of thinking of multiple approaches to overcome problems or answer questions. It will turn your thought process into that of an analytical thinker; it will change how you make decisions, how you receive factual claims from people, and how you assess how truthful those claims are. It will provide you with a logical, analytical framework for deciding when to accept a claim, reject it, or stay neutral.

Data Science is more of a lifestyle, and a philosophy, rather than just a career

Big Data with AI for Business is just compelling

What is Big Data?

Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analysed for insights that lead to better decisions and strategic business moves.

Big Data refers to our ability to make sense of the vast amounts of data that we generate every single second. In recent times our world has become increasingly digitized, and we produce more data than ever before. The amount of data in the world is simply exploding at the moment.

The internet, more powerful computing, and cheaper data storage have helped us use data much better than ever before. Big Data means companies like Google can personalize our search results, and Netflix and Amazon can understand our choices as customers and recommend the right things for us. We can even use Big Data to analyse the entire social media traffic around the world to spot trends.

Benefits of Big Data with AI

By bringing together big data and AI technology, companies can improve business performance and efficiency by:

  • Anticipating and capitalizing on emerging industry and market trends
  • Analyzing consumer behavior and automating customer segmentation
  • Personalizing and optimizing the performance of digital marketing campaigns
  • Using intelligent decision support systems fuelled by big data, AI, and predictive analytics

Real-time Examples of AI and Big Data in Business

Here are examples of companies that use AI with Big Data and have seen enormous success in their fields.

Case Study (a): Netflix – Big Data and AI

Netflix uses AI and Big Data extensively and has achieved great success as an organization. It has over 200 million subscribers around the world.

  • Generate Content: AI with big data helps Netflix understand consumers at an ever more granular level, thereby helping Netflix generate content that matches consumers’ tastes to a large extent. Competitors have a 40% success rate, whereas Netflix enjoys an 80% success rate.
  • Recommend Programmes: Netflix uses AI to recommend new movies and television programmes to consumers; 80% of what consumers watch is driven by its AI recommendations. Netflix fine-tunes its algorithms to understand consumers and provide recommendations about programmes and movies.
  • Auto-generate Thumbnails: Netflix uses AI to auto-generate thumbnails. Consumers spend limited time choosing films, looking at just the thumbnails for a few seconds to minutes. Netflix understood how important thumbnails are to consumers choosing their favourite programmes, so thumbnails are generated dynamically, using artificial intelligence, based on each consumer’s interests.
  • Vary Streaming Speed: Netflix uses AI to predict consumers’ internet speed. AI algorithms help scale the streaming of movies up or down based on each consumer’s real-time internet bandwidth.
  • Assist Pre-production: Netflix uses AI in pre-production activities. It helps find location spots to shoot a movie (based on actors’ availability, actors’ locations, etc.)
  • Assist Post-production: Netflix uses AI widely in post-production activities as well. Although editing is manual, quality checks are driven by AI to avoid mistakes in post-production. Several mistakes used to happen due to negligence or a lack of time and resources during post-production, but with AI algorithms Netflix could eradicate these problems to a great extent.

Case Study (b): Disney (Theme Park and Cinemas) – Big Data and AI

Disney uses Big Data and AI to give customers a more magical experience. Disney has always been a tech innovator, in both theme parks and cinemas, giving customers a wonderful experience.

  • Magic Band: Disney offers a magic band to customers when they enter the theme park. It’s a kind of fitness watch that opens hotel rooms and allows customers to pay. It has a GPS tracker, which tracks customers as they walk within the Disney theme park: where they are going, which rides they spend time on, and how much time they spend in restaurants.
  • Better Operational Management: The data helps schedule workers to manage overcrowding at a single ride or restaurant within the park.
  • Better Customer Experience: Better crowd management and proper assistance within the park give customers a better experience. Staff might direct customers to other rides or restaurants to avoid delays in one place.
  • Realtime Sentiment Analysis: Disney’s research team started using AI to understand real-time reactions while people watch a live show or a movie in the cinema. They use “machine vision”: AI coupled with a night-vision camera looking at the audience. The cameras interpret facial expressions, watching how people respond to the shows or movies to see if they are sad, scared, having fun, etc. This in turn helps Disney generate quality content for its shows and movies based on its customers.

Case Study (c): Big Data and AI with Motor Insurance

Motor insurance providers have started using AI with Big Data to provide dynamic, flexible insurance plans that suit different customers based on their driving skills, ability, and composure at different times.

  • Motor insurance companies generally determine the premium based on the age of the vehicle. Providers then started trying to understand the customer based on how they drive, rather than only on the age factor, which gave the perception that a person aged 18 would drive rashly compared with a person aged 55, who would show maturity in driving.
  • Tracking Card: Motor insurance providers started providing a tracking card to insert in the vehicle, which helps them track and understand the customer’s driving ability. This helped the provider understand the customer better.
  • Mobile App: Now, replacing the card with a GPS-connected mobile phone, providers just need to install a mobile app on the customer’s phone. This helps providers collect information about the customer’s driving. With AI and Big Data, providers can study the customer at a granular level: how the customer drives on a highway, during a rainy day, or on a hilly mountain road. There are also people aged 18 who can drive better than older people; with AI algorithms, over a period of time, providers can understand each individual and how they drive in the morning or late at night, on a rainy day, or during peak hours. Hence, granular data about the customer helps insurance providers offer flexibility based on driving skills, not merely the age of the vehicle or the age of the customer.
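As a toy illustration of behaviour-based pricing (the trip features, weights, and premium formula are entirely invented; real actuarial models are far more involved), a per-driver risk score computed from telematics data could scale the premium:

```python
# Hypothetical telematics features per trip: (speeding_events, harsh_brakes, night_trip).
trips = [
    (0, 1, False),
    (2, 0, True),
    (1, 2, False),
]

# Invented weights, for illustration only.
W_SPEEDING, W_BRAKE, W_NIGHT = 3.0, 2.0, 1.5

def trip_risk(speeding, brakes, night):
    """Weighted risk contribution of a single trip."""
    return W_SPEEDING * speeding + W_BRAKE * brakes + (W_NIGHT if night else 0.0)

# Behaviour-based premium: scale a base premium by the average trip risk,
# instead of pricing on vehicle or driver age alone.
avg_risk = sum(trip_risk(*trip) for trip in trips) / len(trips)
base_premium = 500.0
premium = base_premium * (1 + avg_risk / 100)
print(round(premium, 2))  # 527.5
```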


It’s no hype: AI with big data is not just another set of headline technologies for IT giants to boast about. It has been used widely in several sectors and industries, from big organizations to small businesses. The implementation of AI with Big Data in every industry has proved a great success and has helped companies’ business to a great extent. As said in the beginning, the world is exploding with data at the moment, and Big Data with AI is really making sense of this huge data, together with the internet, more powerful computing, and cheaper data storage.

Multi-horizon Quantile Time Series Forecasting Model

We are happy to announce our new deep learning multi-horizon time series forecasting model, which is part of our Decision Support Platform: a sales forecasting and recommendation platform that helps companies forecast sales and revenues and suggests approaches to increase sales. Business owners can also track and understand customer churn and improve customer retention, along with receiving possible key actions for specific products sold by the company.

This is a non-technical post announcing the model; we will publish another, technical post that explains the model in detail in the coming days.

The model we have developed outperforms Facebook’s Prophet model by 73% and Amazon’s DeepAR by 46% using a normalised quantile loss. Below is a comparison conducted on a dataset from an e-commerce retailer in the UK, forecasting daily revenue for the next month.

The above shows how our model visually outperforms the other models, mainly in how accurate the fit is and in its ability to follow the spikes in the series. Below is a combined forecasting plot for all models, to easily compare each model’s errors.

Below is a line plot for the whole time-series dataset used in the experiment, where the hold-out (test) set starts at the beginning of November.
The median-prediction evaluation shows that our model outperforms Prophet by 73% and DeepAR by 46% using a normalised quantile loss.
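For readers unfamiliar with the metric: the quantile (pinball) loss, normalised by the scale of the series, is a standard way to compare probabilistic forecasters. Below is a minimal sketch of how such a figure can be computed; the arrays are invented, and this is not our exact evaluation code:

```python
def quantile_loss(y_true, y_pred, q):
    """Pinball loss summed over the forecast horizon for quantile q."""
    total = 0.0
    for y, yhat in zip(y_true, y_pred):
        diff = y - yhat
        total += max(q * diff, (q - 1) * diff)
    return total

def normalised_quantile_loss(y_true, y_pred, q=0.5):
    """A common 'q-risk' style normalisation: 2 * pinball loss / sum(|y|)."""
    return 2 * quantile_loss(y_true, y_pred, q) / sum(abs(y) for y in y_true)

# Invented daily-revenue figures for illustration.
y_true = [100.0, 120.0, 80.0]
y_pred = [110.0, 100.0, 90.0]
print(round(normalised_quantile_loss(y_true, y_pred), 4))  # 0.1333
```

At q = 0.5 this reduces to a scaled absolute error of the median forecast, which is why the median-prediction comparison above is reported with this loss.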

The model is part of our effort to build a Sales recommendation system named “”. You can visit to know more and sign up for the beta version.

How Artificial Intelligence is Transforming Modern Marketing

Are you struggling to choose the best marketing strategy or to measure the effectiveness and adequacy of your marketing campaign? You are not alone; I am too.

To set this straight before you go ahead and read the entire article: I’m no expert in marketing strategies, but I am an expert in digital transformation and in building intelligent systems that can advance your marketing strategy.

Today, most organizations follow a conventional, traditional approach to developing their marketing strategies. It involves a great deal of effort and requires a good study of the market and alignment with the corporate strategy. However, I would argue that these strategies are predominantly based on past experience and have little to do with “your data”. It is rare to see organizations employ advanced analytics to build their strategies, mostly due to technical complexities or an inability to harvest the data.

Data-Driven Organization

You must have seen this title before. Numerous organizations like to put it in their strategies to indicate that the organization puts data first. Although this is a great direction to take, few organizations manage to implement it well. Only those who really understand how to put “data first” succeed in building a data-driven organization.

Building a “Data-Driven Organization” is an extremely challenging task. It takes the entire organization to achieve it: many processes need to be redesigned, rules need to be rewritten, and business logic needs to be rethought. Equally, the IT infrastructure needs to be ready to support this, from building systems to storing and manipulating data.

Data and Marketing

No matter how good and robust your strategy is, it will be extremely fragile if not based on facts and data. Strategy, after all, is a process, a thoughtful one; you need to collect data about your organization, products, customers, and partners in order to tailor the strategy to work best for you.

The data is available in two places: within your organization’s systems, and outside your perimeter. The latter is mostly found in open data; nowadays, social media and global news on the internet represent a big portion of it. That is why organizations these days use social media monitoring tools to observe what people are saying about them and their brands.

Social Media and Marketing

Companies today are in a race to attract more customers and promote their products to consumers online, and more specifically over social media platforms. It has become standard practice to analyze what people say on social media to measure the performance of the marketing and communication department. It is a powerful tool, and we have seen the impact these platforms have on the social, economic, and political life we have today.

Artificial Intelligence and Marketing

Artificial intelligence was introduced to solve our inability to process massive amounts of data and spot important things, like when people are happy or angry about a service or product we offer. Many tools today offer basic to advanced natural language processing to read the unstructured data, make sense of it, and present insights that could help organizations improve their services.

AI can be used in various marketing scenarios, and I will give a shortlist of potential scenarios where AI can be of help:

  • Personalized Recommendations: AI can help deliver personalized content and thereby improve the chance that a customer clicks on or chooses a product or service. With proper data planning, you can collect information about your customers’ preferences (with consent) and display the relevant products and services.
  • Customer Care: Customer care is a big umbrella that covers interacting with customers, receiving feedback, and processing customers’ requests. AI can be used at various touchpoints within the customer journey.
  • Conversational Agents (Basic & Advanced Chatbots): Chatbots and conversational agents are becoming more and more widely accepted due to their high adoption by many organizations.
  • Content & Website Design: Today there are many tools that help with content generation and website design recommendations. Organizations can easily leverage these tools to create and publish compelling content.
  • Advertisement Bidding: AI is used in all advertisement platforms, and organizations can use these available features. For example, you can let Google Ads decide what works best for you, without needing to understand how bidding strategies work.
  • Understand the Buyer Persona: Understanding the buyer persona is key. You can use AI to determine the “intent” of a prospect’s request and then deliver the request to the right team.
  • Audience Targeting: You can use analytics and advanced analytics to determine the right target audience. You can also use AI tools to screen public data and generate insights that help you define your target audience.
  • Topic & Title Generation: This is perhaps one of the most challenging tasks in AI, and today we see quite good advancements in the field. You can generate titles and topics that attract more customers.
  • Customer Churn: Identifying customer churn is important. You can direct certain marketing campaigns at, or offer discounts to, customers likely to churn.
  • Lead Scoring and Health: You can use AI to assign a score to each lead to help sellers qualify it. This optimizes the quality of leads and the sales team’s ability to utilize the marketing efforts.
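As one concrete illustration of the personalized-recommendations point above (the interaction data is invented; a production recommender would use far richer signals), item-to-item cosine similarity over a user-item matrix is enough to drive “customers also viewed” suggestions:

```python
from math import sqrt

# Hypothetical user-item interactions (1 = viewed/bought, 0 = not).
interactions = {
    "u1": {"A": 1, "B": 1, "C": 0, "D": 0},
    "u2": {"A": 1, "B": 1, "C": 1, "D": 0},
    "u3": {"A": 0, "B": 0, "C": 1, "D": 1},
}
items = ["A", "B", "C", "D"]

def column(item):
    """The item's interaction vector across all users."""
    return [interactions[user][item] for user in sorted(interactions)]

def cosine(v, w):
    dot = sum(a * b for a, b in zip(v, w))
    norm = sqrt(sum(a * a for a in v)) * sqrt(sum(b * b for b in w))
    return dot / norm if norm else 0.0

def most_similar(item):
    """Item with the highest cosine similarity to `item`."""
    scores = {other: cosine(column(item), column(other))
              for other in items if other != item}
    return max(scores, key=scores.get)

print(most_similar("A"))  # B: users who viewed A also viewed B
```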

Marketing Recommendation Platform

Realizing the importance of digital marketing and the current gap in finding the right tools to help marketers make better decisions, we at Noura.AI decided to build a platform that helps marketing professionals and companies make decisions regarding the services and products they provide.

Musihb (مُسْهِب) is an advanced artificial intelligence global media platform that assists organizations in making decisions in the areas of marketing and customer success. Musihb collects millions of news and social media feeds, analyzes them, and provides organizations with the insights and decision choices that help optimize customer experience and improve business outcomes.

Unlike other tools, Musihb provides “recommendations”; we call them “parameterized recommendations”, where the AI engine determines the best recommendations and then decides the values within those recommendations that fit your organization. For example, available tools identify your negative sentiment; Musihb’s AI engine, on the other hand, tells you what improves your sentiment and by what percentage you will probably improve when following the recommendations!

Try it for Free!

We believe AI should be available to all. Most of the available tools are very expensive, and few organizations can afford the cost.

We provide a ton of features with a very affordable subscription fee that is suitable for many. You can also try the tool before you commit to any payment. TRY NOW.

Before you invest in Artificial Intelligence WATCH THIS

Are you thinking of investing in artificial intelligence or getting into the data science domain? Surely there has been much buzz about it in recent years; big companies and small alike are increasingly investing in these technologies, so the obvious question is: should you invest now?

In this article, I’m going to shed light on why you should start considering investing in AI and how you should approach it. Obviously, this article is not meant for everyone, but even if you are not in the IT field it will highlight why business executives should pay attention, and how it will help them in their digital transformation journeys.

Alright, let me begin by attempting to convince you to put your money, time, and effort into this investment. Let’s look at some numbers:

  • In 2015, a Gartner survey showed that only 10% of respondents reported either using AI or thinking about using it, while data shows that number had risen dramatically to 37% by 2019
  • In 2019, the market for artificial intelligence was valued at about $27B, with projected growth of 10X by 2027
  • AI’s contribution to GDP in 2030, by region, is expected to be 26% of GDP for China, 14.5% for North America, and 12.5% for my home country

I hope this whets your appetite to know more about investing in AI. To that end, I will share with you three things I believe are essential for any investment consideration, and more specifically so in advanced technologies.

Start a Learning Journey

You need to familiarise yourself with data science and advanced analytics. It’s so easy these days to find good courses online, both free and paid. The learning is not just for the purpose of becoming a data scientist; rather, it gives you an understanding of the field you are investing in. Another very important topic you need to research is the set of problems for which you think AI would be of great help. You need to envisage how the use of advanced technologies would really solve a real business problem. In other words, you need to be the digital advisor who uses his or her creativity to solve challenging problems. Remember: learning is a journey, not a destination. So keep on learning, experimenting, and exploring new things.

Work in the Field 

If you can afford to work in a startup or an international company, do so to gain experience, get exposure to the market, and access a large network of customers, and thereby explore various challenges.

Surely, sometimes it may not be possible to get a job in this field; however, there are other means, such as freelancing and open source communities, that you can leverage.

It is crucial to be equipped with both theoretical knowledge and practical applied experience that teaches you what works and what does not.

Due Diligence

This step is perhaps the most discussed, and it will vary depending on how you approach the investment. If you are investing your money in a startup, then you would want to look for a few things:

  • The robustness of the idea: its viability in the market, its visibility, and its impact on business and society.
  • The founders’ past history and current competence and skills, because, after all, they will be leading your investment.
  • The founders’ readiness in vision, clarity, go-to-market, and operational plans is critical. It’s very important to look for business models that offer resilience and flexibility, and that provide diversity rather than relying on one single product or idea, because that could be risky.
  • Look for a startup with the right team mixture; it’s like a recipe, where every detail matters. Building a thriving culture that values customer empathy and has great values is essential for any business’s success.
  • Check for market traction and current customers, if any. Validate how the business model will attract new customers and, most importantly, how fast. Again, I stress the business model and its ability to grow organically in market size and consumption.

On the other hand, if you are investing your time, skills and energy by founding a startup, you need to ask yourself five questions:

  • Am I offering a unique value proposition that solves a problem for a large business segment, and is there an urgent need for it at this time?
  • Do I have the capability to implement this idea on time and on budget, and to offer it in a timely manner at an acceptable price?
  • Am I building an evolving business model that can sustain changes in the market and can easily pivot and transform into different business models?
  • Am I able to build a thriving culture that attracts talent, creates shared values and goals and, above all, inspires people to achieve the impossible?
  • Do I have what it takes to attract customers and investors and to be the face and the biggest seller of the company?

These were the three tips I wanted to share with you today. AI is all about R&D, so always look for startups that invest serious effort in research and development, because the process involves trial and error, and results only come after many failed experiments.

Entrepreneurial University: How to Drive Private Sector Innovation?

Think with me! How many great research ideas, papers and projects conducted by university professors and final-year students are now sitting “on the shelf”? How many business opportunities has a company missed by not having an innovation team or department? But wait: where do great ideas come from in the first place?

As someone who worked for many years as a digital transformation advisor, I can say with certainty that business innovation comes mostly from research. In fact, big companies have enormous R&D teams and spend billions of dollars on research alone. An important question, then, is how the private sector, and particularly startups, can follow the same path.

At our company, for example, we work with university professors on research papers that represent “THE CORE” of our work. We firmly believe our success comes from working on the latest research in Data & AI, combined with business innovation, to create next-level products that can compete with technologically advanced offerings in the market.

I would guess that the term “Entrepreneurial University” in the title intrigued you. Did I get that right? I have always been captivated by the notion of working with universities to foster entrepreneurial thinking, collaborate, and resolve the knowledge paradox between the academic and private worlds. In fact, this model is widely used in developed countries and is considered the second source of funding for academic research in the US.

Finding Common Ground

Finding common ground between academic researchers and the private sector can be difficult, a road full of hardship, mainly, in my opinion, because of the different mindsets of sellers in private companies and scientists. Nevertheless, both parties recognise that they need each other to reach their goals. So what is the secret to bridging the gap between the two fields?

The secret, in my opinion, is innovative thinking. Both fields can embrace innovative thinking and adopt a shared process, one that serves as a “connector” between them.

Private Companies Viewpoint

Private companies look for profitability and are always measured by their ability to make money. Yes, companies employ other measures, but in the end what counts is how much money they earned in a given period. So, for simplicity, let's say private companies view the world from a money angle. To earn that money, companies must implement various strategies to better allocate resources and achieve their goals.

Whatever the company's strategy and the type of products it makes, traditional products are now very hard to sell. Consumers have become more sophisticated and demand new experiences. Companies have no choice but to transform the way they do business and to keep innovating. To do that, many have established innovation departments to ideate and bring in new ideas.

It is clear that basing products on the latest research would position a company's products as industry-leading, assuming proper marketing and sales strategies. Working with research institutes therefore brings great opportunities to business. Recognising this, private companies can work with researchers to ensure the result is aligned with the company strategy and the final product is consumer-friendly.

Researchers Viewpoint 

Researchers, on the other hand, focus on the quality of research and its outcomes. Although researchers do consider the practicality of proposed solutions, they do not tend to focus on their sell-ability. It is immensely important that researchers are not distracted by sales or business matters, so they can focus on the quality of the outcomes.

Having a clear process will certainly help both researchers and private companies collaborate and produce tangible outcomes without compromise from any party.

Process To Adopt 

Having an independent regulatory body that helps shape the regulations and guidelines governing the relationship, and ensures that ethical and proper conduct is in place, is of great value. There is an active relationship today between researchers and the private sector; however, it is not bound by a clear process that leaves no ambiguity. Below is a proposed process, rather than a simple one, that ensures a consistent relationship:

[Image: diagram of the proposed process]

Influence & Impartiality

Although the idea of industry research funding seems to be an effective spur to the development of research, there are growing concerns over the fair-mindedness of research, its ethics, and the impartiality of both research topics and researchers. This has led countries to develop guidelines and governance models that ensure adherence to properly checked procedures, help avoid conflicts of interest, and preserve the lofty goals of scientific research, while enabling the private sector both to contribute to research advancement and to bring innovation to business and consumers.

Financial Model

A financial model that ensures both researchers and research institutes are well compensated is sorely needed, and it will also incentivise the private sector to invest: the cost of establishing an innovation team inside a company would be higher than compensating a researcher at a university. Furthermore, an industry research fund helps researchers produce more results, and it gives them access to enterprise-level tools and resources. For instance, researchers can use a company's data annotators, or hire someone easily through the company's purchasing department. They can also build an appealing, well-presented UI that helps deliver the solution and showcase its capabilities.

Regulations & Guidelines

Setting up regulations and guidelines to govern the relationship between the private sector and academic research institutes is unequivocally important: it ensures the sustainability of the relationship and of the outcomes it yields, which contribute to business innovation and to an increase in research activity.

The key points to take into consideration whilst planning and building such guidelines and governance models are to ensure:

  • Clear guidelines for intellectual property and patent ownership, including any artefacts such as code and datasets.
  • Clear guidelines for the future development and usage of the research outcomes, including any packaging and repackaging of a solution.
  • Clear guidelines on the licensing scheme and distribution.
  • Clear guidelines on the compensation scheme and governance model, to ensure fairness and avoid any abuse.
  • Clear guidelines on procedures that ensure research fairness and correctness and preserve the ethics of research conduct.

It is also worth noting that governments need to build a framework that helps the academic and private sectors collaborate without worrying about complex engagement models or fearing a breach of any law. That should also include creating a body that oversees the relationship and ensures adherence to the framework.

I hope you found this article useful and enriching, and I would be delighted to receive your comments and feedback. Also, please do share your experience if you are an academic who has had the chance to work with the private sector.