Python for data analysis : (Record no. 22686)

MARC details
000 -LEADER
fixed length control field 12611nam a22002417a 4500
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
ISBN 9789355421906
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 005.133
Item number MCK
100 1# - MAIN ENTRY--AUTHOR NAME
Author name McKinney, Wes,
245 10 - TITLE STATEMENT
Title Python for data analysis :
Sub Title data wrangling with Pandas, NumPy, and Jupyter /
Statement of responsibility, etc Wes McKinney.
250 ## - EDITION STATEMENT
Edition statement 3rd ed.
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication Beijing :
Name of publisher O'Reilly Media,
Year of publication 2022.
300 ## - PHYSICAL DESCRIPTION
Number of Pages xvi, 561 pages :
Other physical details illustrations ;
Dimensions 24 cm
500 ## - GENERAL NOTE
General note Includes index.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Preface
1. Preliminaries
1.1 What Is This Book About? -- What Kinds of Data?
1.2 Why Python for Data Analysis? -- Python as Glue -- Solving the “Two-Language” Problem -- Why Not Python?
1.3 Essential Python Libraries -- NumPy -- pandas -- matplotlib -- IPython and Jupyter -- SciPy -- scikit-learn -- statsmodels -- Other Packages
1.4 Installation and Setup -- Miniconda on Windows -- GNU/Linux -- Miniconda on macOS -- Installing Necessary Packages -- Integrated Development Environments and Text Editors
1.5 Community and Conferences
1.6 Navigating This Book -- Code Examples -- Data for Examples -- Import Conventions
2. Python Language Basics, IPython, and Jupyter Notebooks
2.1 The Python Interpreter
2.2 IPython Basics -- Running the IPython Shell -- Running the Jupyter Notebook -- Tab Completion -- Introspection
2.3 Python Language Basics -- Language Semantics -- Scalar Types -- Control Flow
2.4 Conclusion
3. Built-In Data Structures, Functions, and Files
3.1 Data Structures and Sequences -- Tuple -- List -- Dictionary -- Set -- Built-In Sequence Functions -- List, Set, and Dictionary Comprehensions
3.2 Functions -- Namespaces, Scope, and Local Functions -- Returning Multiple Values -- Functions Are Objects -- Anonymous (Lambda) Functions -- Generators -- Errors and Exception Handling
3.3 Files and the Operating System -- Bytes and Unicode with Files
3.4 Conclusion
4. NumPy Basics: Arrays and Vectorized Computation
4.1 The NumPy ndarray: A Multidimensional Array Object -- Creating ndarrays -- Data Types for ndarrays -- Arithmetic with NumPy Arrays -- Basic Indexing and Slicing -- Boolean Indexing -- Fancy Indexing -- Transposing Arrays and Swapping Axes
4.2 Pseudorandom Number Generation
4.3 Universal Functions: Fast Element-Wise Array Functions
4.4 Array-Oriented Programming with Arrays -- Expressing Conditional Logic as Array Operations -- Mathematical and Statistical Methods -- Methods for Boolean Arrays -- Sorting -- Unique and Other Set Logic
4.5 File Input and Output with Arrays
4.6 Linear Algebra
4.7 Example: Random Walks -- Simulating Many Random Walks at Once
4.8 Conclusion
5. Getting Started with pandas
5.1 Introduction to pandas Data Structures -- Series -- DataFrame -- Index Objects
5.2 Essential Functionality -- Reindexing -- Dropping Entries from an Axis -- Indexing, Selection, and Filtering -- Arithmetic and Data Alignment -- Function Application and Mapping -- Sorting and Ranking -- Axis Indexes with Duplicate Labels
5.3 Summarizing and Computing Descriptive Statistics -- Correlation and Covariance -- Unique Values, Value Counts, and Membership
5.4 Conclusion
6. Data Loading, Storage, and File Formats
6.1 Reading and Writing Data in Text Format -- Reading Text Files in Pieces -- Writing Data to Text Format -- Working with Other Delimited Formats -- JSON Data -- XML and HTML: Web Scraping
6.2 Binary Data Formats -- Reading Microsoft Excel Files -- Using HDF5 Format
6.3 Interacting with Web APIs
6.4 Interacting with Databases
6.5 Conclusion
7. Data Cleaning and Preparation
7.1 Handling Missing Data -- Filtering Out Missing Data -- Filling In Missing Data
7.2 Data Transformation -- Removing Duplicates -- Transforming Data Using a Function or Mapping -- Replacing Values -- Renaming Axis Indexes -- Discretization and Binning -- Detecting and Filtering Outliers -- Permutation and Random Sampling -- Computing Indicator/Dummy Variables
7.3 Extension Data Types
7.4 String Manipulation -- Python Built-In String Object Methods -- Regular Expressions -- String Functions in pandas
7.5 Categorical Data -- Background and Motivation -- Categorical Extension Type in pandas -- Computations with Categoricals -- Categorical Methods
7.6 Conclusion
8. Data Wrangling: Join, Combine, and Reshape
8.1 Hierarchical Indexing -- Reordering and Sorting Levels -- Summary Statistics by Level -- Indexing with a DataFrame’s columns
8.2 Combining and Merging Datasets -- Database-Style DataFrame Joins -- Merging on Index -- Concatenating Along an Axis -- Combining Data with Overlap
8.3 Reshaping and Pivoting -- Reshaping with Hierarchical Indexing -- Pivoting “Long” to “Wide” Format -- Pivoting “Wide” to “Long” Format
8.4 Conclusion
9. Plotting and Visualization
9.1 A Brief matplotlib API Primer -- Figures and Subplots -- Colors, Markers, and Line Styles -- Ticks, Labels, and Legends -- Annotations and Drawing on a Subplot -- Saving Plots to File -- matplotlib Configuration
9.2 Plotting with pandas and seaborn -- Line Plots -- Bar Plots -- Histograms and Density Plots -- Scatter or Point Plots -- Facet Grids and Categorical Data
9.3 Other Python Visualization Tools
9.4 Conclusion
10. Data Aggregation and Group Operations
10.1 How to Think About Group Operations -- Iterating over Groups -- Selecting a Column or Subset of Columns -- Grouping with Dictionaries and Series -- Grouping with Functions -- Grouping by Index Levels
10.2 Data Aggregation -- Column-Wise and Multiple Function Application -- Returning Aggregated Data Without Row Indexes
10.3 Apply: General split-apply-combine -- Suppressing the Group Keys -- Quantile and Bucket Analysis -- Example: Filling Missing Values with Group-Specific Values -- Example: Random Sampling and Permutation -- Example: Group Weighted Average and Correlation -- Example: Group-Wise Linear Regression
10.4 Group Transforms and “Unwrapped” GroupBys
10.5 Pivot Tables and Cross-Tabulation -- Cross-Tabulations: Crosstab
10.6 Conclusion
11. Time Series
11.1 Date and Time Data Types and Tools -- Converting Between String and Datetime
11.2 Time Series Basics -- Indexing, Selection, Subsetting -- Time Series with Duplicate Indices
11.3 Date Ranges, Frequencies, and Shifting -- Generating Date Ranges -- Frequencies and Date Offsets -- Shifting (Leading and Lagging) Data
11.4 Time Zone Handling -- Time Zone Localization and Conversion -- Operations with Time Zone-Aware Timestamp Objects -- Operations Between Different Time Zones
11.5 Periods and Period Arithmetic -- Period Frequency Conversion -- Quarterly Period Frequencies -- Converting Timestamps to Periods (and Back) -- Creating a PeriodIndex from Arrays
11.6 Resampling and Frequency Conversion -- Downsampling -- Upsampling and Interpolation -- Resampling with Periods -- Grouped Time Resampling
11.7 Moving Window Functions -- Exponentially Weighted Functions -- Binary Moving Window Functions -- User-Defined Moving Window Functions
11.8 Conclusion
12. Introduction to Modeling Libraries in Python
12.1 Interfacing Between pandas and Model Code
12.2 Creating Model Descriptions with Patsy -- Data Transformations in Patsy Formulas -- Categorical Data and Patsy
12.3 Introduction to statsmodels -- Estimating Linear Models -- Estimating Time Series Processes
12.4 Introduction to scikit-learn
12.5 Conclusion
13. Data Analysis Examples
13.1 Bitly Data from 1.USA.gov -- Counting Time Zones in Pure Python -- Counting Time Zones with pandas
13.2 MovieLens 1M Dataset -- Measuring Rating Disagreement
13.3 US Baby Names 1880–2010 -- Analyzing Naming Trends
13.4 USDA Food Database
13.5 2012 Federal Election Commission Database -- Donation Statistics by Occupation and Employer -- Bucketing Donation Amounts -- Donation Statistics by State
13.6 Conclusion
A. Advanced NumPy
A.1 ndarray Object Internals -- NumPy Data Type Hierarchy
A.2 Advanced Array Manipulation -- Reshaping Arrays -- C Versus FORTRAN Order -- Concatenating and Splitting Arrays -- Repeating Elements: tile and repeat -- Fancy Indexing Equivalents: take and put
A.3 Broadcasting -- Broadcasting over Other Axes -- Setting Array Values by Broadcasting
A.4 Advanced ufunc Usage -- ufunc Instance Methods -- Writing New ufuncs in Python
A.5 Structured and Record Arrays -- Nested Data Types and Multidimensional Fields -- Why Use Structured Arrays?
A.6 More About Sorting -- Indirect Sorts: argsort and lexsort -- Alternative Sort Algorithms -- Partially Sorting Arrays -- numpy.searchsorted: Finding Elements in a Sorted Array
A.7 Writing Fast NumPy Functions with Numba -- Creating Custom numpy.ufunc Objects with Numba
A.8 Advanced Array Input and Output -- Memory-Mapped Files -- HDF5 and Other Array Storage Options
A.9 Performance Tips -- The Importance of Contiguous Memory
B. More on the IPython System
B.1 Terminal Keyboard Shortcuts
B.2 About Magic Commands -- The %run Command -- Executing Code from the Clipboard
B.3 Using the Command History -- Searching and Reusing the Command History -- Input and Output Variables
B.4 Interacting with the Operating System -- Shell Commands and Aliases -- Directory Bookmark System
B.5 Software Development Tools -- Interactive Debugger -- Timing Code: %time and %timeit -- Basic Profiling: %prun and %run -p -- Profiling a Function Line by Line
B.6 Tips for Productive Code Development Using IPython -- Reloading Module Dependencies -- Code Design Tips
B.7 Advanced IPython Features -- Profiles and Configuration
B.8 Conclusion
Index
520 ## - SUMMARY, ETC.
Summary, etc "Get complete instructions for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.6, the second edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You'll learn the latest versions of pandas, NumPy, IPython, and Jupyter in the process"--Page 4 of cover.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Subject Python (Computer program language)
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Subject Programming languages (Electronic computers)
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Subject Data mining.
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Subject Data analysis
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Subject Data mining.
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Subject Python (Computer program language)
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Books
Holdings
Withdrawn status | Lost status | Source of classification or shelving scheme | Damaged status | Not for loan | Home library | Current library | Shelving location | Date acquired | Source of acquisition | Cost, normal purchase price | Bill Date | Full call number | Accession Number | Price effective from | Koha item type
 |  | Dewey Decimal Classification |  |  | Institute of Public Enterprise, Library | Institute of Public Enterprise, Library | S Campus | 08/11/2023 | Professional Book Services | 1800.00 | 04-08-2023 | 005.133 MCK | 47679 | 08/11/2023 | Books
 |  | Dewey Decimal Classification |  |  | Institute of Public Enterprise, Library | Institute of Public Enterprise, Library | S Campus | 08/11/2023 | Professional Book Services | 1800.00 | 04-08-2023 | 005.133 MCK | 47680 | 08/11/2023 | Books
