
Sharpen your teeth in Data Analytics (and maybe create a portfolio while you’re at it)


 

As a novice data analyst myself, I truly understand the mind of another beginner raring to go out into the world, tame a wild data set and analyse the heck out of it. Understandable, then, is the urge to pick up the same old Kaggle data set that is begging not to be analysed for the umpteenth time, all whilst arriving at the same insights as the 24,637 people who came before you. One must not be utterly taken aback, then, when the face of the interviewer (seeing the same analysis for the thirteenth time that day) turns red in anger without the application of any sort of conditional formatting.

So, one might naturally enquire about the remedy for such an ailment. I would then point you towards the less traversed yet endlessly fascinating direction of obscure publicly available datasets. From an exhaustive data set of all known passengers of the RMS Titanic to the largest reference data set of the human genome, not only do they make for remarkably interesting candidates for analytics projects, but they also set you apart in the eyes of interviewers.

The data out there is so varied that every budding data enthusiast is bound to discover something that piques their analytical interest.


So here are some of my favourites that you can check out:
  1. Data.gov.in: Do your parents boast that they used to shop for a whole week’s worth of groceries for just 15 rupees in 2009? Bust out the last 15 years of Consumer Price Index (CPI) data from the Government of India’s official data repository to prove them wrong, once and for all. Developed by the National Informatics Centre (NIC) under the aegis of the Ministry of Electronics and Information Technology (aka MeitY), Data.gov.in hosts more than 6,00,000 resources across sectors including crime, judiciary, urban and sports. In this data heaven, everyone is sure to find something that would make a worthy addition to their portfolio. Not only that, but you also get access to a wide selection of their APIs. Bonus points for no sign-up being necessary.

  2. Awesome Public Datasets (GitHub): From Swiss apartment models to the biggest crowdsourced database of the American gut microbiome, ‘Awesome Public Datasets’ is a source of admittedly more global, yet no less amusing, datasets which one can explore in search of their next project. These datasets were painstakingly collected and tidied from blogs and user responses. Most of them are absolutely free and part of the open source movement. Again, no obligation to sign up!

  3. Sindresorhus’s Awesome Collection (GitHub): This list, my friends, is the GitHub equivalent of Sir Ravindra Jadeja because of the all-rounder variety of resources it holds. Not only is it home to learning resources ranging from fintech to Generative AI, it also holds free books, public datasets and much more. This list is a one-stop shop to learn anything and everything! Do, however, make sure to have blinders on while you visit this page, otherwise you’re guaranteed to be distracted along the way (speaking from personal experience).

  4. Figshare: This one is for all the academically minded folks out there. Figshare has an endless trove of datasets from close to 25 categories ranging from economics to earth sciences. Be it China’s COVID-19 case data from January 2020 or the species of native plants in any state of the US, if you can think of it, this data repository probably has it. With a clean UX, it successfully distinguishes itself from the typical academic website, making it easier for a newbie to find their way around. The good news? You can download up to 20 GB of this data for FREE! (Thank me later.)

  5. Google Trends: Did you know that mentions of the term “big data” peaked in October 2018? Wanna know why? Then this is my homework for you: find out through Google’s own repository of all things trends and keywords. Alright! I must admit this one isn’t very “obscure”, but it deserves a mention nonetheless. A pioneer in “nowcasting”, Google Trends is the backbone of all sorts of projects that need real-time updates, the OECD’s weekly GDP tracker being a good example. Being the good Samaritans they are, they have an extremely helpful section right upfront to teach newbies how to make the most of this data as well.

  6. World Bank Open Data: Wonder how the GDP of the nations of the world has changed over the past 30 years? No biggie, the World Bank has you covered! With its ‘World Bank Open Data’ initiative, it has made a true wealth of financial and fiscal data available to the masses. This data is available to download in CSV, XML and Excel formats, along with access to their own data bank and thematic tables for easy understanding. All at the click of a button!

  7. OECD Data Explorer: In their own words, the Organisation for Economic Co-operation and Development (OECD) (phew) is an international organisation working towards making better policies for better lives. But they’re not all talk: they’ve made data on everything from tobacco consumption and variation in body weight between nationalities to wildfires available to one and all. This excellent selection of data can help you analyse everything from levels of alcoholism between states in India to the variation in the occurrence of obesity within the country. Truly a great way to spend one’s Saturday, don’t you think? (Just kidding.)

  8. UCI Machine Learning Repository: Aimed at machine learning enthusiasts, this is an excellent repository of more than 600 datasets for all the newbies trying to get familiar with ML. It will help you go from ML-clueless to ML connoisseur in no time!
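To give a flavour of what a first pass over index-style data (such as the CPI series mentioned under Data.gov.in, or the GDP series from the World Bank) might look like, here is a minimal Python sketch. The index values below are made up purely for illustration and are not real CPI figures, and the helper names `adjust_price` and `percent_change` are my own, not part of any of these sites’ APIs. Swap in the real numbers once you have downloaded them.

```python
# Hypothetical index values for illustration only (NOT real CPI data).
# Replace them with figures downloaded from a source such as Data.gov.in.
cpi_by_year = {2009: 100.0, 2024: 250.0}


def adjust_price(amount, from_year, to_year, cpi):
    """Scale an amount by the ratio of the index in to_year to from_year."""
    return amount * cpi[to_year] / cpi[from_year]


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# What would a 15-rupee basket from 2009 cost in 2024 under these made-up values?
basket_2009 = 15.0
basket_2024 = adjust_price(basket_2009, 2009, 2024, cpi_by_year)
rise = percent_change(cpi_by_year[2009], cpi_by_year[2024])

print(f"Equivalent 2024 price: {basket_2024:.2f} rupees")
print(f"Rise in the index: {rise:.1f}%")
```

The same `percent_change` helper works just as well for a question like how much a country’s GDP has changed over 30 years: pass it the first and last values of the series.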

So, I hope that equipped with these sources, you will make your portfolio stand out like a kangaroo in a penguin enclosure. Do always remember the wise words of Franklin D. Roosevelt, “The only thing we have to fear is fear itself, and maybe not backing up important data” (don’t quote me on that though).

Till we meet again, Data comrades!
 
About Author
Vasudev Pandey
I am a budding data scientist and mechatronics engineer with a passion for history and finance. I write about anything and everything I find interesting.

Comments

  1. Thank you for sharing these public datasets with the TakeOff Talent community, Vasudev. This will surely help many people who are confused around how to build a portfolio in analytics as well as in data science.

  2. great article, thank you so much for sharing vasudev

  3. Thanks Vasu...very helpful

  4. This will surely help the DS community. - Dan
