Class 9 Artificial Intelligence Code 417 Solutions
Session 2025 - 26
Artificial Intelligence Class 9 (Code 417) NCERT solutions for Section A and Section B are provided here as question answers and as a PDF, along with the Class 9 AI syllabus and the Class 9 AI book. First go through the Artificial Intelligence (Code 417) Class 9 notes and solutions, and then solve the MCQs and Sample Papers. Artificial Intelligence (Code 417) is a vocational subject.
--------------------------------------------------
Chapter - Data Literacy
Other Topics
📌 Data Acquisition
Definition:
Data Acquisition is the process of collecting and bringing in data from different sources for analysis, decision-making, and training AI/ML systems. It can involve raw data collection, integration, cleaning, and storing.
[Figure: Data Acquisition]
🔹 Key Components of Data Acquisition
1. Data Discovery – Finding and identifying relevant data sources.
Example: Searching customer transaction records, website logs, or open government datasets.
2. Data Augmentation – Enhancing existing data by adding extra context or external datasets.
Example: Adding weather data to retail sales to see how weather affects shopping (a short sketch follows this list).
3. Data Generation – Creating new synthetic data when real data is insufficient or missing.
Example: Using AI to generate additional product images for training a recommendation system.
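To make data augmentation concrete, here is a minimal Python sketch using pandas (the library choice, the inline data, and the column names are assumptions for illustration): an external weather table is joined onto existing sales records to add context.

```python
import pandas as pd

# Existing data: daily retail sales (hypothetical values)
sales = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-02", "2025-01-03"],
    "units_sold": [120, 95, 180],
})

# External data: daily weather for the same dates (hypothetical values)
weather = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-02", "2025-01-03"],
    "temperature_c": [14, 9, 21],
    "rainfall_mm": [0, 12, 0],
})

# Data augmentation: add the weather columns to each sales record
augmented = sales.merge(weather, on="date", how="left")
print(augmented)
```

The augmented table now lets an analyst (or an AI model) check whether rainy or cold days change how much is sold.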
💡 Case Study: E-commerce Personalization
Scenario:
An e-commerce company wants to improve product recommendations.
1. Data Discovery – Collects user browsing history, past purchases, and search queries.
2. Data Augmentation – Adds social media sentiment data and weather patterns to see external influence on buying behavior.
3. Data Generation – Generates synthetic clickstream data (fake but realistic browsing patterns) to simulate rare scenarios, like holiday sales.
Outcome:
- Better personalized product recommendations.
- Increased sales by predicting what a customer might buy next.
- Improved decision-making for marketing campaigns.
Sources of Data
Sources of Data refer to the origins from where data is collected. Broadly, data can come from primary or secondary sources:
1. Primary Sources (Direct collection of fresh/original data)
- Surveys & Questionnaires – customer feedback forms, opinion polls
- Interviews – one-on-one discussions, expert talks
- Experiments – lab tests, A/B testing in digital marketing
- Observations – tracking user behavior, field studies
- Sensors/IoT Devices – temperature sensors, fitness trackers
2. Secondary Sources (Pre-existing data collected by others)
- Government Publications – census data, economic surveys
- Research Articles & Journals – academic studies
- Company Records – sales reports, HR databases
- Web & Social Media Data – Twitter, Facebook analytics, web scraping
- Databases & Repositories – Kaggle, UCI ML repository
Classification by Nature
- Structured Data – databases, spreadsheets
- Unstructured Data – images, videos, social media posts
- Semi-structured Data – JSON, XML logs
👉 In short, data sources are everywhere—from what we create (social media posts), what we measure (sensor data), to what we analyze (reports and studies).
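To illustrate the difference in structure, here is a small Python sketch (the record and field names are made up for illustration) showing the same product as structured tabular data and as semi-structured JSON; unstructured data, such as a product photo or a social media post, has no such fixed fields.

```python
import csv
import io
import json

# Structured: fixed rows and columns, like a spreadsheet or database table
structured = "product_id,name,price\n101,Notebook,45\n102,Pen,10\n"
for row in csv.DictReader(io.StringIO(structured)):
    print(row)

# Semi-structured: JSON has labelled fields, but their shape can vary per record
semi_structured = '{"product_id": 101, "name": "Notebook", "tags": ["stationery", "paper"]}'
record = json.loads(semi_structured)
print(record["name"], record["tags"])
```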
Best Practices for Data Acquisition
When we talk about Best Practices for Acquiring Data, we mean the smart, ethical, and efficient ways of collecting data so that it remains reliable, accurate, and useful.
1. Define Your Purpose Clearly
- Before collecting data, ask: “Why do I need this data?”
- Align data acquisition with business, research, or project goals.
2. Choose the Right Sources
- Use primary sources (surveys, experiments) for fresh, targeted data.
- Use secondary sources (government reports, databases) when reliable existing data is available.
3. Ensure Data Quality
- Validate accuracy by cross-checking sources.
- Remove duplicates, missing values, and inconsistencies.
- Collect data in a standardized format.
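A minimal sketch of the "remove duplicates, missing values, and inconsistencies" step, assuming pandas and a small made-up survey table (the column names are illustrative only):

```python
import pandas as pd

# Illustrative raw survey responses (hypothetical data)
raw = pd.DataFrame({
    "student": ["Asha", "Asha", "Ravi", "Meera", None],
    "marks": [78, 78, None, 91, 66],
    "city": ["delhi", "delhi", "Delhi", "Mumbai", "Pune"],
})

clean = (
    raw.drop_duplicates()                              # remove duplicate rows
       .dropna(subset=["student", "marks"])            # drop rows missing key fields
       .assign(city=lambda d: d["city"].str.title())   # standardize the format
)
print(clean)
```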
4. Respect Privacy and Ethics
- Follow data protection laws (like GDPR or India’s DPDP Act).
- Collect only the data you need.
- Always ensure user consent if personal data is involved.
5. Automate Where Possible
- Use APIs, IoT devices, and automated scripts to collect real-time data.
- Reduces human error and increases efficiency.
6. Secure Data During Collection
- Use encryption when transferring data.
- Protect storage with authentication and access control.
7. Document the Process
- Keep track of where, how, and when data was collected.
- This improves transparency, reproducibility, and trust.
✅ In short:
Good data acquisition = Relevant + High-quality + Ethical + Secure data.
Checklist of Factors that Make Data Good or Bad:
✅ Good Data Qualities
- Accuracy – Correct, free from errors or misrepresentations.
- Completeness – No missing values; all required fields are filled.
- Consistency – Same format, structure, and values across datasets.
- Timeliness – Up-to-date and relevant at the time of use.
- Relevance – Matches the problem or decision-making need.
- Validity – Fits within defined rules, formats, or ranges.
- Reliability – Collected from trustworthy and credible sources.
- Accessibility – Easy to access and retrieve when needed.
- Granularity – Sufficiently detailed for analysis.
- Uniqueness – No duplicates or redundant records.
❌ Bad Data Qualities
- Inaccurate – Contains typos, misclassifications, or wrong values.
- Incomplete – Missing fields or blank values that reduce usefulness.
- Inconsistent – Conflicting formats or mismatched values across datasets.
- Outdated – Old information that no longer reflects reality.
- Irrelevant – Doesn’t answer the intended question.
- Invalid – Breaks rules (e.g., letters in a phone number field).
- Unreliable – From unknown, biased, or untrustworthy sources.
- Hard to Access – Locked in silos or inaccessible formats.
- Too Vague – Lacks detail to support decisions.
- Duplicate/Redundant – Same records appearing multiple times.
Data Acquisition from Websites
Data Acquisition from Websites is the process of collecting data that is publicly available (or accessible via permission) on websites. This is a common method for building datasets for research, business insights, or AI applications.
🔑 Methods of Data Acquisition from Websites
Web Scraping
- Using tools or scripts to extract structured data from web pages.
- Example: Scraping e-commerce product details (price, rating, reviews).
- Tools: BeautifulSoup, Scrapy, Selenium, Puppeteer.
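A minimal web-scraping sketch in Python, assuming the requests and BeautifulSoup (bs4) libraries, a hypothetical page https://example.com/products, and made-up `<h2 class="product-name">` tags; a real site needs its own selectors, and its terms of service must be checked first.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical page; always check the site's terms of service before scraping
URL = "https://example.com/products"

response = requests.get(URL, timeout=10)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Extract product names from (assumed) <h2 class="product-name"> tags
for tag in soup.find_all("h2", class_="product-name"):
    print(tag.get_text(strip=True))
```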
APIs (Application Programming Interfaces)
- Many websites provide APIs for legal and structured data access.
- Example: Twitter API for tweets, YouTube API for video stats.
- Advantage: Cleaner and faster than scraping.
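In contrast, an API returns clean, structured data directly. Below is a sketch with the requests library against a hypothetical endpoint and API key; real services such as the YouTube or Twitter/X APIs have their own URLs, parameters, and authentication rules.

```python
import requests

# Hypothetical endpoint and key, for illustration only
URL = "https://api.example.com/v1/videos/stats"
params = {"video_id": "abc123", "api_key": "YOUR_API_KEY"}

response = requests.get(URL, params=params, timeout=10)
response.raise_for_status()

data = response.json()  # structured JSON instead of raw HTML
print(data.get("views"), data.get("likes"))
```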
RSS Feeds & XML/JSON Endpoints
- Some websites expose data through feeds.
- Example: News websites providing RSS feeds.
Open Data Portals
- Government and organizations publish data for public use.
- Example: Data.gov, World Bank Open Data.
Browser Automation
- When data is dynamic (loaded via JavaScript), automation tools like Selenium simulate user interaction to capture it.
✅ Best Practices
- Always check website terms of service (avoid illegal scraping).
- Prefer official APIs over scraping.
- Use rate limiting to avoid overloading servers (a short sketch follows this list).
- Ensure data cleaning & validation after acquisition.
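Rate limiting can be as simple as pausing between requests. A minimal sketch, assuming the requests library and a hypothetical list of pages:

```python
import time
import requests

# Hypothetical list of pages to fetch politely
pages = ["https://example.com/page1", "https://example.com/page2"]

for url in pages:
    response = requests.get(url, timeout=10)
    print(url, response.status_code)
    time.sleep(2)  # wait 2 seconds between requests so the server is not overloaded
```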
⚡ Real-life Example
An online travel company collects flight prices from multiple airline websites.
- Method: Scraping + API.
- Use case: Build a price comparison tool.
- Challenge: Dynamic content & frequent site changes.
- Solution: API + automated scrapers updated regularly.
⚖️ Ethical Concerns in Data Acquisition
When we talk about Ethical Concerns in Data Acquisition, the focus is on how data is collected, stored, and used. Even if data is technically available, it doesn’t always mean it’s ethical to take or use it.
[Figure: Ethical Concerns]
1. Privacy Violations
- Collecting personal information (emails, phone numbers, location) without consent.
- Example: Scraping social media profiles for sensitive details.
2. Consent & Transparency
- Users often don’t know their data is being collected.
- Ethical practice: Inform users and obtain clear consent.
3. Data Ownership
- Who owns the data? The user, the platform, or the company acquiring it?
- Misuse: Taking proprietary datasets without permission.
4. Bias & Fairness
- Collected data may be incomplete or biased.
- Example: Training AI on data that represents only certain groups → leads to unfair outcomes.
5. Security Risks
- Storing data irresponsibly can cause leaks and breaches.
- Example: A scraped dataset of credit card details being exposed.
6. Legal vs Ethical Boundaries
- Some practices may be legal but still unethical.
- Example: Collecting health data from forums, then using it for marketing without user awareness.
7. Overuse of Web Resources
- Aggressive scraping can harm websites, slowing down servers.
- Ethical approach: Respect robots.txt, rate limits, and fair usage.
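Whether robots.txt allows a particular page can be checked in code before fetching anything. A small sketch using Python's standard urllib.robotparser module (the site URL and user-agent name are hypothetical):

```python
from urllib import robotparser

# robots.txt tells crawlers which paths they may fetch (hypothetical site)
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

url = "https://example.com/products/page1"
if rp.can_fetch("MyStudentBot/1.0", url):
    print("Allowed to fetch:", url)
else:
    print("robots.txt disallows fetching:", url)
```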
✅ Best Ethical Practices
- Collect only what is necessary (data minimization).
- Use anonymization & encryption (a small anonymization sketch follows this list).
- Follow GDPR, HIPAA, CCPA regulations.
- Be transparent: tell users what you’re collecting and why.
- Respect website terms of service.
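One simple way to anonymize personal identifiers before storing them is to replace each one with a one-way hash. This sketch uses Python's standard hashlib module; it is an illustration only, and real projects would also consider salting and stronger privacy techniques.

```python
import hashlib

def anonymize(value: str) -> str:
    """Replace a personal identifier with a one-way SHA-256 hash."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

# Hypothetical personal data collected with consent
emails = ["student@example.com", "teacher@example.com"]
anonymized = [anonymize(e) for e in emails]
print(anonymized)  # hashed values can be counted and analysed without exposing raw emails
```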
⚡ Real-Life Case Study
A research group scraped 70,000 dating profiles for academic study without informing users. The data was later published online → led to privacy backlash and ethical criticism, even though it wasn’t strictly illegal.