DATA WRANGLING

Yogita Sahu
Oct 14, 2024



Data wrangling (or data munging) involves cleaning and transforming raw data into a format that is ready for analysis. It includes several processes:


1. Data Cleaning:

  Handling Missing Values: Techniques include imputation (mean, median, mode), removal of missing entries, or using algorithms that can handle missing data on their own.

  Removing Duplicates: Identifying and eliminating duplicate records to ensure data integrity.
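
A minimal sketch of the two cleaning steps above using pandas. The column names and sample values are made up for illustration; median and mode imputation are just two of the options mentioned, not the only valid choice.

```python
import pandas as pd

# Hypothetical sample data: two missing ages, one missing city, one duplicate row.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Ravi", "Meera", "Kabir"],
    "age": [25.0, None, None, 31.0, 40.0],
    "city": ["Mumbai", "Pune", "Pune", None, "Pune"],
})

# Remove exact duplicate rows first so they do not skew the imputed values.
deduped = df.drop_duplicates().reset_index(drop=True)

# Impute the numeric column with its median and the text column with its mode.
cleaned = deduped.copy()
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())
cleaned["city"] = cleaned["city"].fillna(cleaned["city"].mode().iloc[0])

# Alternative: drop every row that still contains a missing value.
dropped = deduped.dropna()

print(cleaned)
print(dropped)
```

Imputation keeps all four unique rows, while dropping missing entries keeps only the two complete ones, so the right choice depends on how much data you can afford to lose.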


2. Data Transformation:

  Normalization: Adjusting values to a common scale.

  Encoding Categorical Variables: Converting categorical data into numerical format using techniques like one-hot encoding or label encoding.
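
The transformations above can be sketched in pandas as follows. The data is hypothetical, and min-max scaling and z-score standardization are shown as two common ways to reach "a common scale"; the post does not say which one it has in mind.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "income": [20000.0, 50000.0, 80000.0],
    "size": ["small", "large", "medium"],
})

col = df["income"]

# Min-max normalization: rescale values into the range [0, 1].
df["income_minmax"] = (col - col.min()) / (col.max() - col.min())

# Z-score standardization: zero mean, unit (population) standard deviation.
df["income_z"] = (col - col.mean()) / col.std(ddof=0)

# Label encoding: each category becomes an integer code (alphabetical order here).
df["size_label"] = df["size"].astype("category").cat.codes

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["size"], prefix="size")

print(df)
print(one_hot)
```

Label encoding implies an order between categories (here an arbitrary alphabetical one), so one-hot encoding is usually the safer choice when the categories have no natural ranking.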


3. Feature Engineering:

  Creating new features, or deriving them from existing ones, to improve model performance, such as combining date and time into a single feature or extracting domain-specific metrics.
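
As a rough illustration, the sketch below combines separate date and time columns into one timestamp and derives a few extra features from it. The trip data and the `avg_speed_kmh` metric are assumptions standing in for a "domain-specific metric".

```python
import pandas as pd

# Hypothetical trip records with date and time stored separately.
df = pd.DataFrame({
    "date": ["2024-10-14", "2024-10-19"],
    "time": ["09:30:00", "22:15:00"],
    "distance_km": [12.0, 30.0],
    "duration_h": [0.5, 1.5],
})

# Combine date and time into a single datetime feature.
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])

# Derive new features from the combined timestamp.
df["hour"] = df["timestamp"].dt.hour
df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5  # Monday=0 ... Sunday=6

# Domain-specific metric built from two existing columns.
df["avg_speed_kmh"] = df["distance_km"] / df["duration_h"]

print(df[["timestamp", "hour", "is_weekend", "avg_speed_kmh"]])
```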


4. Data Integration:

  Combining data from multiple sources to create a unified dataset, which typically involves merging data frames or joining database tables.
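
One way to sketch this in pandas is with `merge` (joining on a shared key) and `concat` (appending rows from another source). The two tables and their column names are invented for the example.

```python
import pandas as pd

# Hypothetical data from two different sources.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Asha", "Ravi", "Meera"],
})
orders = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": [1, 1, 4],
    "amount": [250.0, 100.0, 75.0],
})

# Inner join: keep only customers that have at least one matching order.
inner = customers.merge(orders, on="customer_id", how="inner")

# Left join: keep every customer; indicator=True adds a _merge column
# showing whether each row found a match in orders.
left = customers.merge(orders, on="customer_id", how="left", indicator=True)

# Appending rows from another source that has the same columns.
more_orders = pd.DataFrame({"order_id": [104], "customer_id": [2], "amount": [60.0]})
all_orders = pd.concat([orders, more_orders], ignore_index=True)

print(inner)
print(left)
print(all_orders)
```

Checking the `_merge` column after a join is a quick way to spot keys that exist in one source but not the other.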


5. Outlier Detection and Treatment:

  Identifying outliers and deciding how to handle them, which may involve removal, transformation, or capping.
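
The sketch below uses the interquartile range (IQR) rule to flag outliers and then applies each of the three treatments. The IQR rule with a 1.5 multiplier is an assumption on my part; the post does not name a specific detection method, and the sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one obvious outlier.
values = pd.Series([10, 12, 11, 13, 12, 11, 95], dtype="float64", name="value")

# IQR rule: anything beyond 1.5 * IQR from the quartiles is flagged.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
is_outlier = (values < lower) | (values > upper)

# Treatment 1: removal.
removed = values[~is_outlier]

# Treatment 2: capping (winsorizing) at the IQR fences.
capped = values.clip(lower=lower, upper=upper)

# Treatment 3: transformation (log compresses large values).
logged = np.log1p(values)

print(is_outlier.tolist())
print(capped.tolist())
```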


6. Reshaping Data:

  Using pivot tables, melting, or stacking to change the format of a dataset so that it is easier to analyze or visualize.
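
A small sketch of the three reshaping operations in pandas, using an invented table of student scores. It goes from long format to wide with `pivot_table`, back to long with `melt`, and shows `stack` as another way to fold columns into the index.

```python
import pandas as pd

# Hypothetical long-format data: one row per (student, subject) pair.
long_df = pd.DataFrame({
    "student": ["Asha", "Asha", "Ravi", "Ravi"],
    "subject": ["math", "science", "math", "science"],
    "score": [90, 85, 70, 80],
})

# Pivot: one row per student, one column per subject.
wide = long_df.pivot_table(
    index="student", columns="subject", values="score", aggfunc="mean"
)

# Melt: turn the wide table back into long format.
melted = wide.reset_index().melt(
    id_vars="student", var_name="subject", value_name="score"
)

# Stack: move the column labels into an inner index level (returns a Series).
stacked = wide.stack()

print(wide)
print(melted)
print(stacked)
```

Wide tables are often easier to read and compare, while long tables are what most plotting and grouping functions expect.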

 

 Tools and Libraries


Pandas: A powerful Python library for data manipulation and analysis, offering functions for cleaning, reshaping, merging, and aggregating data.

NumPy: Useful for numerical operations and handling arrays.
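
To give a feel for the array-level operations NumPy provides, here is a short sketch that fills missing readings with column means and then rescales the whole array in one vectorized step. The readings are hypothetical.

```python
import numpy as np

# Hypothetical 2x3 array of sensor readings with one missing value.
readings = np.array([
    [1.0, 2.0, np.nan],
    [4.0, 5.0, 6.0],
])

# Column means that ignore NaN values.
col_means = np.nanmean(readings, axis=0)

# Replace each NaN with the mean of its column (col_means broadcasts across rows).
filled = np.where(np.isnan(readings), col_means, readings)

# Vectorized arithmetic applies to every element without an explicit loop.
scaled = filled * 10

print(col_means)
print(filled)
print(scaled)
```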




