
Sharpen your teeth in Data Analytics (and maybe create a portfolio while you’re at it)


 

As a novice data analyst myself, I truly understand the mind of another beginner raring to go out into the world, tame a wild data set and analyse the heck out of it. Understandable, then, is the urge to pick up the same old Kaggle data set that is begging not to be analysed for the umpteenth time, all whilst arriving at the same insights as the 24,637 people who came before you. One must not be utterly taken aback, then, when the face of the interviewer (seeing the same analysis for the thirteenth time that day) turns red in anger without the application of any sort of conditional formatting.

So, one might naturally enquire into the remedy to such an ailment. I would then point you towards the less traversed yet endlessly fascinating direction of obscure publicly available datasets. From an exhaustive data set of all known passengers of the RMS Titanic to the largest reference data set of the human genome, not only do they make for remarkably interesting candidates for analytics projects, but they also set you apart in the eyes of interviewers. 

The data out there is so diverse that every budding data enthusiast is bound to discover something that piques their analytical interest.


So here are some of my favourites that you can check out:
  1. Data.gov.in: Do your parents boast that they used to shop for a whole week’s worth of groceries all for 15 rupees in 2009? Bust out the last 15 years of Consumer Price Index (CPI) data from the Government of India’s official data repository to prove them wrong, once and for all. Developed by the National Informatics Centre (NIC) under the aegis of the Ministry of Electronics and Information Technology (aka MeitY), Data.gov.in has data from more than 6,00,000 resources covering sectors such as crime, judiciary, urban affairs and sports. At this data heaven, everyone is sure to find something that would make a worthy addition to their portfolio. Not only this, but you also get access to a wide selection of their APIs. Bonus points for no sign-ups necessary.

  2. Awesome public datasets (GitHub): From Swiss apartment models to the biggest crowdsourced database of the American gut microbiome, ‘Awesome public datasets’ is a source of admittedly more global, yet no less amusing datasets which one can explore in search of their next project. These datasets were painstakingly collected and tidied from blogs and user responses. Most of them are absolutely free and part of the open source movement. Again, no obligation to sign up!

  3. Sindresorhus’s Awesome Collection (GitHub): This list, my friends, is the GitHub equivalent of Sir Ravindra Jadeja because of the all-rounder variety of resources it holds. Not only is it home to learning resources ranging from fintech to Generative AI, it also holds free books, public datasets and much more. This list is a one-stop shop to learn anything and everything! Do, however, make sure to have blinders on while you visit this page, otherwise you’re guaranteed to be distracted along the way (speaking from personal experience).

  4. Figshare: This one is for all the academically minded folks out there. Figshare has an endless trove of datasets from close to 25 categories ranging from economics to earth sciences. Be it China’s Covid-19 case data from January 2020 or the species of native plants in any state of the US, if you can think of it, this data repository probably has it. With a clean UX, it successfully distinguishes itself from the typical academic website, making it easier for a newbie to find their way around. The good news? You can download up to 20 GB of this data for FREE! (thank me later)

  5. Google Trends: Did you know that mentions of the term “big data” peaked in October 2018? Wanna know why? Then that is my homework for you: find out through Google’s own repository of all things trends and keywords. Alright! I must admit this one isn’t very “obscure” but it deserves a mention, nonetheless. A pioneer in “nowcasting”, Google Trends is the backbone of all sorts of projects that need real-time updates, the OECD’s weekly GDP tracker being a good example. Being the good Samaritans they are, they have an extremely helpful section right upfront to teach newbies how to make the most of this data as well.

  6. World Bank Open Data: Wonder how the GDP of the nations of the world has changed over the past 30 years? No biggie, the World Bank has you covered! With its ‘World Bank Open Data’ initiative, it has made a true wealth of financial and fiscal data available to the masses. This data is available to download in CSV, XML and Excel formats, along with access to their own data bank and thematic tables for easy understanding. All at the click of a button!

  7. OECD Data Explorer: In their own words, the Organisation for Economic Co-operation and Development (OECD) (phew) is an international organisation working towards making better policies for better lives. But they’re not all talk: they’ve made data on everything from tobacco consumption and variation in body weight between nationalities to wildfires available to one and all. This excellent selection of data can help you analyse everything from levels of alcoholism between states in India to the variation of occurrence of obesity within the country. Truly a great way to spend one’s Saturday, don’t you think? (just kidding)

  8. UCI Machine Learning Repository: Focused on machine learning enthusiasts, this is an excellent repository of more than 600 datasets for all the newbies trying to get familiar with ML. It will help you go from being clueless about ML to an ML connoisseur in no time!
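
To make the grocery-bill argument from the first entry a little more concrete, here is a minimal Python sketch of the arithmetic involved: scale the old price by the ratio of two CPI index values. Everything in it is an assumption for illustration. The column names `year` and `cpi`, the helper names `load_cpi` and `adjust_for_inflation`, and above all the index values are placeholders, not real figures from Data.gov.in. Swap in the actual CSV you download before showing your parents anything.

```python
import csv
import io

# Placeholder data: these are NOT real CPI figures, and the column
# names are hypothetical. Replace with the CSV you actually download.
SAMPLE_CSV = """year,cpi
2009,100.0
2024,250.0
"""


def load_cpi(csv_text):
    """Parse CSV text with 'year' and 'cpi' columns into a {year: index} dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["year"]): float(row["cpi"]) for row in reader}


def adjust_for_inflation(amount, cpi_then, cpi_now):
    """Express an old price in today's money using the ratio of two CPI values."""
    if cpi_then <= 0:
        raise ValueError("cpi_then must be positive")
    return amount * cpi_now / cpi_then


cpi = load_cpi(SAMPLE_CSV)
todays_equivalent = adjust_for_inflation(15, cpi[2009], cpi[2024])
print(f"Rs 15 in 2009 is roughly Rs {todays_equivalent:.2f} in 2024 (placeholder data)")
```

The same ratio trick works for any price index series, so once the real data is loaded you can compare any two years it covers.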

So, I hope that equipped with these sources, you will make your portfolio stand out like a kangaroo in a penguin enclosure. Do always remember the wise words of Franklin D. Roosevelt, “The only thing we have to fear is fear itself, and maybe not backing up important data” (don’t quote me on that though).

Till we meet again, Data comrades!
 
About Author
Vasudev Pandey
I am a budding data scientist and mechatronics engineer with a passion for history and finance. I write about anything and everything I find interesting.

Comments

  1. Thank you for sharing these public datasets with the TakeOff Talent community, Vasudev. This will surely help many people who are confused around how to build a portfolio in analytics as well as in data science.

  2. great article, thank you so much for sharing vasudev

  3. Thanks Vasu...very helpful

  4. This will surely help the DS community. - Dan

