Q2 2022 ML Readiness Survey Report


By Chen Song, Data Scientist | 8/8/2022

Table of Contents

1. Preface
2. Q2-2022 Current ML Insights
3. Introduction to MLR Student Survey Questions
4. Semantic Scoring Methodology
5. ML/AI Future Insights
6. About Us
7. References
8. Contributors

Preface

AI plays a major role in every industry, and companies of all sizes are now upgrading or replacing their technology stacks to limit the compounding technical debt they have accrued from legacy software systems. The goal, of course, is to drive profound economic and social change across the globe, enhancing business value and building societal confidence in the applied technologies.

Machine learning (ML), as an optimization process for AI technologies, is vital for delivering more efficient and smarter AI solutions. Embracing machine learning is no longer optional: adopting it drives a dramatic delta in business results that can improve an organization’s bottom line. The machine learning lifecycle has five major components, and a sixth, business value, has now become a core component of ML.

According to an estimate reported by Forbes, “The global machine learning market is projected to grow from $7.3B in 2020 to $30.6B in 2024, attaining a compound annual growth rate (CAGR) of 43%” (Columbus, 2020). (See Figure 1.)
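The quoted growth rate can be sanity-checked from the two market-size figures. The snippet below is a minimal sketch that uses the standard CAGR formula; the function name `cagr` is our own illustration and is not taken from the Forbes article.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.43 means 43%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    # CAGR = (end / start) ** (1 / years) - 1
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of USD, as quoted above: 2020 -> 2024 is 4 years.
rate = cagr(7.3, 30.6, 4)
print(f"Implied CAGR: {rate:.1%}")
```

With these inputs the implied rate comes out at roughly 43%, which is consistent with the quoted figure.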

Figure 1. Example of the interactive maps developed using GMaps plugins.

An organization's ability to adopt machine learning is a key driver of its market positioning, so evaluating machine learning readiness (MLR) is crucial. The higher an organization's MLR, the more opportunity it has to attract investment, which in turn drives valuations and hiring. While business value remains a core component of staying competitive, the impact of machine learning propels businesses across all industries. There is a learning curve, however.

Loxz Digital continues its journey to help organizations, academic institutions, and individuals mitigate ML development risks and spot potential improvement opportunities by providing real-time diagnostic insights from an innovative and specific machine learning market perspective.

Loxz Digital provides three versions of the machine learning readiness diagnostic assessment, targeting organizations, individual data scientists and MLOps engineers, and academic students who want to embark on a career in machine learning. By taking the organization version of Loxz Digital's machine learning readiness survey, you get a scientific diagnostic evaluation of your organization's machine learning readiness. By taking the student version, aspiring ML data scientists receive a practical diagnostic tool to self-evaluate their machine learning readiness.

With 12M new data scientists slated for employment by 2025, knowing a student's strengths and weaknesses in the core components of ML helps hiring managers decide efficiently which areas of the ML lifecycle a candidate fits best at the hiring company. For academic institutions, the survey helps professors evaluate the school’s curriculum and the aptitude of candidates. If you are interested in learning more about the survey, please check out our 2022 Q1 machine learning readiness report. You can also find other resources on the Loxz Digital resources page.

Q2-2022 Current ML Insights

Voice assistants, powered by deep learning algorithms, have become an integral part of many products we use daily. Annual growth in the use of voice assistants was in double figures from 2019 to 2022, and it is expected to continue over the next year.

According to Babich (2022), up to 95% of voice assistant users in 2022 use assistants on their phones. Voice assistants are not limited to personal devices; they are also widely used in public spaces nowadays. For example, many people use Amazon Alexa to check the weather forecast, and many IoT products use voice assistants to let users remotely control their smart homes and cars. There is little doubt that the usage of voice assistants will keep its growth momentum over the next decade in both depth and scope. Voice shopping services have also started to shine in recent years. According to research by Voicebot.ai, 45.2 million U.S. adults used voice to shop for a product at least once in 2021, reflecting 120% growth and a 30% compound annual growth rate (CAGR) (Kinsella, 2021). As shown in Figure 2, the usage of voice assistants is projected to reach 8 million in 2023, an increase of 1.46% compared to 2019.

Figure 2. Voice Assistant Usage Trend from 2019 to 2023.

The number of new AI-driven unicorn companies grew rapidly during 2021, and 10+ new AI-driven unicorns were born in Q1 2022.

Digital transformation is largely driven by artificial intelligence, and according to Microsoft’s CEO, Satya Nadella, the worldwide pandemic accelerated its growth. As a product of digital transformation, AI-driven companies grew rapidly over the past year. Because the benefits AI brings to organizations are evident, more startups are actively adopting AI and machine learning to improve product innovation and increase competitiveness. According to data gathered from CB Insights, 24 new AI-driven unicorns were born in Q2 2021, the highest quarterly count between 2018 and 2022. In the most recent quarter of 2022, another 14 new AI-driven unicorns were born. We believe the trend will continue in 2023 because AI is moving into new areas and AI-driven personalization is in full swing. According to a McKinsey survey (McKinsey & Company, 2020), the proliferation of AI is creating the highest value for business operations because of its ability to personalize.

In June 2022, 64 AI/ML-related companies raised seed rounds averaging 5.5M. While the economy on the surface is slowing due to higher interest rates and inflation, the AI/ML industry has yet to see a slowdown. Investments in predictive analytics and real-time ML are still at a very nascent stage, and MLOps, an industry for monitoring models, has recently found traction. Monitoring models is one of the core lifecycle components of our MLR diagnostic engine, and one question asks organizations whether they monitor, or intend to monitor, the models they have in production. Monitoring models that are in production is a separate revenue-generating stream.

Figure 3. Number of new AI-driven companies from 2018 to 2022

Trend 1: Global eLearning has been evolving, and advanced analytics technologies have the potential to boost eLearning efficiency.

Gamification, individualized learning, and virtual reality are three of the most important components of eLearning, and advanced analytics technologies are required to make these three components reliable and trustworthy. According to Andrea Laura of eLearning Industry, “2022 Is The Year Of Evolving eLearning Trends” (Laura, 2022). With the blossoming of eLearning apps such as Udemy, TED-Ed, and Duolingo, eLearning is gaining additional attention and popularity. According to the Research Institute of America, eLearning increases retention rates by 25 to 60 percent. However, the advantages of eLearning platforms come with pain points. In particular, providing more personalized learning experiences and content to help students grow their knowledge still has a lot of development potential. Advanced analytics technologies can ease these pain points. One McKinsey report also notes that advanced analytics, which uses the power of algorithms such as gradient boosting and random forest, may also help institutions address inadvertent biases in their existing methods. Loxz Digital likewise uses advanced analytics algorithms to analyze the semantics of students’ survey answers, giving academic institutions an assessment tool for better understanding students’ ML aptitude.

Trend 2: Democratization of Data Sets

Over the last quarter or so, we’ve seen an exponential increase in the democratization of datasets. Many of these datasets are badly mislabeled, which pushes practitioners to seek out tools that make data preparation more efficient. In other situations, where you need to obtain a custom dataset, that part of the process is time-consuming and extremely important for the accuracy of your results. This is why tooling is emerging that focuses on that part of the process and, more broadly, on the data itself, an approach called data-centric ML. Introducing new datasets to models is not a new approach; what is now in vogue is introducing the proper dataset with the proper labels and using efficient tools to do so.

Trend 3: Data Embedded in Every Decision and Interaction of a Workflow

Naturally, leveraging data to support “smart workflows” is the next real wave of ML. Predictions will be served in milliseconds to help campaign engineers recalibrate deployments prior to launch.

Rather than defaulting to solving problems by developing lengthy (sometimes multi-day) road maps, campaign engineers are empowered to ask how innovative campaigns can use real-time data techniques to resolve challenges in milliseconds. Think of the A/B test. The capability to make better decisions in real time allows basic day-to-day campaign activities, design approvals, and regularly occurring decisions during the editing process to be automated. This real-time, data-driven culture fosters continuous performance improvement, creates truly differentiated optimization, and enables the growth of sophisticated new applications that aren’t widely available today, such as smart workflows for campaign engineers.


Introduction to MLR Student Survey Questions

The student version of the Loxz Digital machine learning readiness survey was officially published in Q1 2022. It was designed for individual students, academic institutions, and potential employers. Domain experts customized the survey questions for students, and the scoring methodology was developed to assess the potency of a student's career trajectory and curriculum settings.

One of the major differences in the Q2-2022 report between the organization survey and the student survey is the targeted ML aptitude and career trajectory questions, which were recently integrated into the student survey. ML aptitude questions are designed to quantify students’ educational foundation and progression towards a successful career in machine learning. Career trajectory questions focus on quantifying a student's propensity and exposure to succeed in specific characteristics of machine learning. Together, the ML aptitude and career trajectory sections provide granular information about students’ interests and their core ability to excel in machine learning.

In the ML aptitude section, we ask general machine learning technical questions, because we believe that assessing individual students' answers to such questions provides a high-level view of a student's ML aptitude. We’ve implemented our own internal semantic scoring model to assess a student's answer against an answer corpus. One of the single-option questions in this section asks, “How do we perform Bayesian classification when some features are missing?” Bayesian classification is built on Bayes’ theorem, which is widely used in statistics and machine learning. Currently, there are 5 questions in the ML aptitude section, and they build on different theories and techniques in machine learning.

Machine learning is now applied across many industries, so applied questions are also included in the ML aptitude section to assess a student's ability to apply machine learning concepts to practical, real-life problems. For example, one of the ML aptitude questions asks, “The robotic arm will be able to paint every corner in the automotive parts while minimizing the quantity of paint wasted in the process. Which learning technique is used in this problem?” This type of question combines machine learning concepts with real-life cases, allowing us to assess how well a student understands them in context.

Deep learning-related questions are also included in the ML aptitude section because we believe that deep learning, as a subfield of machine learning, has limitless applications. Providing a way to assess ML aptitude in deep learning therefore helps both individual students and academic institutions gain a better understanding of ML aptitude performance. For example, one of the questions asks survey takers to evaluate the differences between ML use case scenarios and deep learning use case scenarios.

In the career trajectory section, career-goal-related and current-curriculum-related questions were introduced to capture insights into both students’ career trajectories and academic curriculum plans. By and large, academic institutions are not asking these questions of students majoring in STEM or Computer Science. By asking questions such as “What have you learned/are you planning to learn for a career in machine learning?”, we are able to uncover insights into the career trajectory of students. In addition, questions such as “Has your academic institution considered doing the following things for your career development?” provide a direct way to assess how well academic institutions prepare students for a machine learning career. Currently, there are 6 questions in this section in total.

Semantic Scoring Methodology

Multiple-choice questions alone did not provide enough specificity to evaluate students’ aptitude in some prerequisites and foundations of ML. Providing students with the most accurate scoring adds a layer of certainty to the Career Trajectory subscore. Adding text-input, open-ended questions gives students additional insight for this purpose, and the answers can be used to assess the potency of a student's career trajectory. These insights are found in the MLR Dashboard. Open-ended questions and answers give us a deeper perspective on students' overall performance in this sub-scoring category.

The semantic scoring methodology was developed to accurately evaluate students’ answers in the ML Aptitude sub-score of our machine learning readiness diagnostic assessment, where Loxz required additional granularity. We apply the BERT tokenizer to both the student's answer and the reference answer, then feed the tokens into a pre-trained BERT model. Once the model is built and deployed, we capture the dense vector embeddings from the last hidden state. We then mean-pool only the embeddings selected by the attention mask to get a vector representation of each answer, and calculate the cosine similarity score between the two. The final similarity score is averaged over all student-corpus answer pairs and normalized to a 0 to 100 scale. We published a full Semantic Scoring Methodology report in May 2022, where you can find all the experiment details and techniques applied in the semantic scoring.
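The pooling and similarity steps described above can be sketched as follows. This is an illustrative sketch only, not Loxz Digital's implementation: it uses small hand-made arrays in place of real BERT last-hidden-state embeddings, and the function names (`mean_pool`, `cosine_similarity`, `semantic_score`) are hypothetical. The rule for mapping the averaged cosine similarity onto 0 to 100 (clipping negatives to zero, then scaling) is also an assumption, since the report does not specify it.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors, counting only positions where the mask is 1."""
    mask = attention_mask[:, None].astype(float)      # shape (tokens, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # padding rows contribute 0
    count = max(float(mask.sum()), 1e-9)              # guard against empty input
    return summed / count


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def semantic_score(student, corpus) -> float:
    """Average similarity of one student answer to every corpus answer, on 0..100.

    `student` is an (embeddings, mask) pair; `corpus` is a list of such pairs.
    ASSUMPTION: negative similarities are clipped to 0 before scaling by 100.
    """
    student_vec = mean_pool(*student)
    sims = [cosine_similarity(student_vec, mean_pool(emb, mask)) for emb, mask in corpus]
    mean_sim = float(np.mean(sims))
    return 100.0 * min(max(mean_sim, 0.0), 1.0)


# Toy stand-ins for last-hidden-state outputs: 3 token slots, 2 dimensions.
# The third slot is padding (mask = 0) and must not affect the result.
student_answer = (np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]), np.array([1, 1, 0]))
reference_a = (np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]), np.array([1, 1, 0]))
reference_b = (np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]), np.array([1, 1, 0]))

print(semantic_score(student_answer, [reference_a, reference_b]))
```

In a real pipeline, the embedding arrays and masks would come from the tokenizer and the pre-trained model's last hidden state rather than being written by hand.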

ML/AI Future Insights

AI/ML is revolutionizing every industry as these technologies are integrated into critical markets and more granular workflows on a global scale. AI is becoming a must-have for most businesses, and organizations across industries are adopting AI/ML in both general and specific workflows.

Machine learning workflows define which phases are implemented during a machine learning project, and workflow orchestration in the data and machine learning space has markedly increased the productivity of data teams at companies large and small. Efficient infrastructure is necessary to get the ideal outcome from ML workloads, and a distributed, scalable, and adaptive orchestration tool is required when running complex business logic. For example, Shopify, an e-commerce giant, has scaled dramatically over the past two years. Its engineering team states that its largest environment averages over 400 tasks running at a given moment and over 150,000 runs executed per day. These tasks are part of an overall ML project within the data-intensive workflow. Our Machine Learning Technology Readiness Levels offer a practical framework, defined as a principled process, to ensure robust, reliable, and responsible applications of machine learning while being streamlined for future ML workflows, including key distinctions from traditional software engineering.

To expand on the Shopify example as an ML trailblazer, the organization version of Loxz Digital’s machine learning readiness survey helps you place your organization's machine learning capabilities in one of four tiers: Leader, Innovator, Performer, or Observer. “AI/ML leaders” excel in multiple ways across the machine learning lifecycle, and you can find the insights in our previous MLR reports on the Loxz Digital resources page. According to PwC’s fourth annual AI business survey, “What sets these companies apart, the data indicates, is that instead of focusing first on one goal, then moving to the next, they’re advancing with AI in three areas at once: business transformation, enhanced decision-making and modernized systems and processes.” (PricewaterhouseCoopers, n.d.). A holistic AI business model decides the future of an organization. Data governance, investment in cloud computing power, and AI-driven business models are key priorities for AI/ML adoption in the future.

Return on investment is one of the vital factors when an organization decides to adopt AI/ML. In practice, the return on investment in AI/ML is proportionate to how long an organization has been using AI/ML, because the time an organization needs to deploy a model affects costs and therefore returns. Algorithmia’s 2020 survey report provides detailed quantitative data on the relationship between deployment time and how long an organization has used machine learning (see Figure 4).

Even though the return on investment in AI/ML is proportionate to how long an organization has been using AI/ML, advancing MLOps tools and end-to-end ML platforms help speed up the deployment process, and companies are seeing more benefits from using MLOps tools. No-code AI and machine learning is no longer a new concept, and we believe it is gaining attention and adoption as ML and AI tools develop.

Figure 4. Time to deploy models vs. how long an organization has used ML

About Us

Loxz Digital Group is a Machine Learning Collective located in Berkeley, CA. Established in December 2020, Loxz is focused on serving real-time predictive analytics. We supply models and serve predictions within smart workflows to clients of the Email Service Provider network, and we are in discussions about serving location-specific predictions to law enforcement to reduce gunshot violence. We employ a servant-leadership management style in which every employee or advisor has a distinct voice.

Specifically, real-time ML is the bedrock of what we do. Collectively, the current team has over 40 years of ML experience and includes 9 data scientists, all located in the United States and Canada. The data in this report comes from both first-party and third-party sources.

©2022 All Rights Reserved.
Visit www.loxz.com

References

1. Columbus, L. (2020, January 23). Roundup of machine learning forecasts and market estimates, 2020. Forbes. Retrieved July 18, 2022, from https://www.forbes.com/sites/louiscolumbus/2020/01/19/roundup-of-machine-learning-forecasts-and-market-estimates-2020/?sh=528325765c02

2. PricewaterhouseCoopers. (n.d.). PWC 2022 AI Business Survey. PwC. Retrieved July 19, 2022, from https://www.pwc.com/us/en/tech-effect/ai-analytics/ai-business-survey.html?WT.mc_id=CT3-PL300-DM1-TR1-LS4-ND30-PRG7-CN_DataAndAnalyticsBuilds-AISurveyGoogle&gclid=CjwKCAjwrNmWBhA4EiwAHbjEQIpndOvmZ-yAhVuN0Nwdv81Tgpo9oeo7j-8OjW5jKHDlBs9we6RQRxoCCSQQAvD_BwE&gclsrc=aw.ds

3. 2020 state of ML. Algorithmia. (n.d.). Retrieved July 19, 2022, from
https://algorithmia.com/state-of-ml

4. Mark. (2022, March 6). Global Machine Learning Development Trends and key applications analysis. Mark Ai Code. Retrieved July 31, 2022, from
https://www.markaicode.com/en/global-machine-learning-development-trends-and-key-applications-analysis/

5. Kapoor, A. (2022, February 8). Top machine learning trends for 2022. Medium. Retrieved July 31, 2022, from https://enlear.academy/top-machine-learning-trends-for-2022-6e7071d37130

6. Babich, N. (2022, June 30). Virtual assistants trends to watch in 2022. Medium. Retrieved August 1, 2022, from https://uxplanet.org/virtual-assistants-trends-to-watch-in-2022-7cf9fd66485b

7. Kinsella, B. (2021, December 24). Voice shopping rises to 45 million U.S. adults in 2021. Voicebot.ai. Retrieved August 1, 2022, from
https://voicebot.ai/2021/12/24/voice-shopping-rises-to-45-million-u-s-adults-in-2021/#:~:text=In%202018%2C%20Voicebot%20Research%20found,Shopping%20Consumer%20Adoption%20Report%202021.

8. The state of AI: CB insights. The State Of AI | CB Insights. (n.d.). Retrieved August 1, 2022, from https://www.cbinsights.com/research-state-of-artificial-intelligence?utm_campaign=Reports&campaignid=17130804447&adgroupid=137057131675&utm_term=2022+ai&utm_source=google&utm_medium=cpc&utm_content=adwords-reports-quarterly-annual-reports&hsa_tgt=kwd-1639454560006&hsa_grp=137057131675&hsa_src=g&hsa_net=adwords&hsa_mt=p&hsa_ver=3&hsa_ad=598025341080&hsa_acc=5728918340&hsa_kw=2022+ai&hsa_cam=17130804447&gclid=Cj0KCQjw852XBhC6ARIsAJsFPN2R5HuSgUP2UESEz0raEbIwt9EO6N3GtsGPmhIDn8ZJqLmYd8G6-zoaAntIEALw_wcB

9. McKinsey & Company. (2020, January 16). AI adoption advances, but foundational barriers remain. McKinsey & Company. Retrieved August 1, 2022, from
https://www.mckinsey.com/featured-insights/artificial-intelligence/ai-adoption-advances-but-foundational-barriers-remain

10. Covid-19 and student learning in the United States ... - mckinsey & company. (n.d.). Retrieved August 1, 2022, from
https://www.mckinsey.com/~/media/McKinsey/Industries/Public%20and%20Social%20Sector/Our%20Insights/COVID-19%20and%20student%20learning%20in%20the%20United%20States%20The%20hurt%20could%20last%20a%20lifetime/COVID-19-and-student-learning-in-the-United-States-FINAL.pdf

11. Laura, A. (2022, June 20). Top elearning trends influencing the education sector in 2022. eLearning Industry. Retrieved August 1, 2022, from
https://elearningindustry.com/top-elearning-trends-influencing-the-education-sector-in-2022

12. Team, T. A. (n.d.). Machine learning orchestration with airflow: Astronomer. Astronomer. Retrieved August 1, 2022, from https://www.astronomer.io/blog/machine-learning-pipeline-orchestration/

Contributors

Chen Song, Data Scientist
Lead Author, Lead Analyst

Yiming Zhang, Lead Data Scientist
Pearson Correlation Heatmap Analyst & Author

Yumi Koyanagi, Designer
Report Designer