Open Source Is Awesome: Get Started with the mljar Open Source MLOps Project

Akash Deep
Published in Analytics Vidhya
6 min read · Jan 2, 2022


Data science open source

In this blog series I will try to cover all aspects of open source: what open source is, how to find open source projects, how to learn new skills and improve existing ones, how to earn from it, and how it can help you travel across the globe and meet cool, exciting and talented people in your space. The end goal of the series is to take one live Python open source project, understand its complete implementation, and finally contribute to it if we have good ideas or find a way to optimize an existing process. This will not happen overnight. It will take a minimum of 2 to 3 months, depending on your proficiency in the language (Python, Java, etc.). If you are a beginner in cloud and machine learning, you can also check my other simple blogs on this Medium account. Now, without wasting any time, let's try to understand what open source is according to me. Yes, you heard it right, this is my own definition of open source :-)

What is open source?

Open source is a place, both virtual and real, where people from around the globe with similar skill sets interact and share their views on a particular project or problem statement. It is also a place where an individual can master their current skills much faster than with other sources of learning such as podcasts, YouTube or books (I am saying this from my own experience). It is also where you learn industry best practices, because most open source projects are developed with a large user base (millions or even billions of users) in mind.

Prerequisites for open source

There are 3 main prerequisites to get started with open source:

  1. You should know at least the basics of one programming language and be reasonably good at googling :-).
  2. You should be patient. Even with a basic programming background, it takes patience to understand someone else's code; once you do, you will slowly master it. This process needs a minimum of 5–6 months of regular hard work.
  3. You should know how to use GitHub (GitHub is a hosting platform for Git, the version control tool used to track code changes). I will be writing a blog on how to get started with GitHub and the basic commands to remember while working with it.

Where to find open source projects?

This is one of the most important questions you are probably thinking about right now. To find open source projects, you can check programs like “Google Summer of Code”. You can also look for projects that match the specific skills you are good at. Let's say you are good at Python and very passionate about machine learning and artificial intelligence. You now want to find an open source project where a group of people is building AI and ML solutions in Python. To do this, simply type “AI and ML open source project in python” into Google search and go through the links of your choice. These are the basic steps for searching for open source projects based on your skills and interests.
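The same kind of search can also be run against GitHub directly. The snippet below is only a minimal sketch: it builds a URL for GitHub's public repository search endpoint from some keywords and a language. The function name `build_repo_search_url` and the example keywords are my own assumptions for illustration, and the snippet does not send any request; you can paste the printed URL into a browser to see the results.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for searching repositories.
GITHUB_SEARCH_API = "https://api.github.com/search/repositories"


def build_repo_search_url(keywords, language, per_page=10):
    """Return a GitHub repository-search URL (hypothetical helper, for illustration)."""
    # Qualifiers such as `language:` are appended to the free-text part of the query.
    query = f"{keywords} language:{language}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{GITHUB_SEARCH_API}?{urlencode(params)}"


# Example keywords are an assumption; replace them with your own interests.
url = build_repo_search_url("automl machine learning", "python")
print(url)
```

Sorting by stars is just one reasonable choice here; it tends to surface larger, more active projects first.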

Challenges you will face while starting with open source

If you have just started with open source, keep one thing in mind: after exploring many projects, you need to settle on only one project to work on. Once you find the project of your choice, your first task is to go to its GitHub page. Explore it on GitHub, look at the open items that people are still working on, and search for any Slack or Teams channel related to the project. In short, do everything you can to understand the complete project code as soon as possible. For a first open source project, "as soon as possible" means a minimum of 5–6 months, so you need to be very patient during this time. I can assure you that once you understand 1 or 2 projects completely, you will be on your way to becoming a master in the open source community (by my estimate, your potential earnings could be 40–60 thousand dollars per month), and obviously this requires dedicated time. It also keeps adding to your knowledge and makes you think more and more.
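To look at a project's open items without clicking through the GitHub pages, you can use GitHub's issues endpoint. The snippet below is a rough sketch that only builds the URL; the helper name `open_issues_url` is hypothetical, and the "good first issue" label is a common GitHub convention that I am assuming here, not something I have confirmed this project uses. I use the mljar-supervised repository, which is covered later in this post, as the example.

```python
from urllib.parse import quote, urlencode


def open_issues_url(owner, repo, label=None, per_page=20):
    """Return the GitHub API URL listing open issues of a repository (illustrative helper)."""
    base = f"https://api.github.com/repos/{quote(owner)}/{quote(repo)}/issues"
    params = {"state": "open", "per_page": per_page}
    if label:
        # Optional filter; whether a project uses this label is an assumption.
        params["labels"] = label
    return f"{base}?{urlencode(params)}"


issues_url = open_issues_url("mljar", "mljar-supervised", label="good first issue")
print(issues_url)
```

Note that this endpoint also returns open pull requests alongside issues, so the list can be longer than the Issues tab suggests.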

Contribute to the mljar open source project

Now we have a basic idea of what open source is and how to get started. Being a Python and ML lover, I selected an open source project that tries to automate the complete data science life cycle and explain its results. Yes, you got it right, I am talking about explainable and automated AI and ML services. With a little exploring I found a project named mljar-supervised that is already implementing ideas similar to mine.

Our goal will be to download the complete mljar-supervised project from its official GitHub repository. The official repository is at https://github.com/mljar/mljar-supervised/

Steps to configure the project environment on your local machine

Once you visit the above link, clone the project to your local storage. Use the command below to clone the GitHub project:

git clone https://github.com/mljar/mljar-supervised.git

Cloning the mljar project locally

Once you clone the project, you will be able to see the complete project folder structure on your local machine:

Folder structure of the project
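If you prefer to inspect the folder structure from Python instead of a file explorer, a small helper like the one below can print it. This is only a sketch: `list_tree` is a hypothetical helper of mine, and it assumes the repository was cloned into a folder named `mljar-supervised` in the current directory, which is the default name `git clone` uses for the URL above.

```python
from pathlib import Path


def list_tree(root, max_depth=2):
    """Return indented lines for the folders and files under root, up to max_depth levels."""
    lines = []

    def walk(folder, depth):
        if depth > max_depth:
            return
        # Folders first, then files, each group sorted alphabetically.
        entries = sorted(folder.iterdir(), key=lambda p: (p.is_file(), p.name.lower()))
        for entry in entries:
            if entry.name == ".git":
                continue  # skip Git's internal metadata folder
            suffix = "/" if entry.is_dir() else ""
            lines.append("  " * (depth - 1) + entry.name + suffix)
            if entry.is_dir():
                walk(entry, depth + 1)

    walk(Path(root), 1)
    return lines


if __name__ == "__main__":
    repo = Path("mljar-supervised")  # assumed clone location
    if repo.is_dir():
        print("\n".join(list_tree(repo)))
    else:
        print(f"{repo} not found; clone the project first.")
```

Limiting the depth to 2 keeps the output short enough to read in one go; increase `max_depth` once you want to dig into a specific package.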

This complete project is written in Python. We will try to understand each and every folder and file. This is where our main task and challenges start: as I mentioned earlier, we need to explore all sorts of possible ways to understand and configure the project. While exploring, I found that mljar also maintains a Slack channel. Please find the Slack channel link below:

In the next blog we will try to understand each file and the folder structure of this project in detail, and see how to work with GitHub. Please let me know if you have any questions related to open source; I will be happy to help. Also, please comment below if you want me to explain any topic related to data science, cloud or machine learning. I am also planning a live working session on YouTube on how to approach an open source project, covering basic building blocks that apply to all projects. I have a YouTube channel where I explain DevOps and AWS cloud concepts. Please find my YouTube link below:

Please find below my other blog on using a deep learning SDK step by step:

Stay tuned for the next update on open source
