000 -LEADER |
fixed length control field |
12611nam a22002417a 4500 |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
ISBN |
9789355421906 |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER |
Classification number |
005.133 |
Item number |
MCK |
100 1# - MAIN ENTRY--AUTHOR NAME |
Author name |
McKinney, Wes, |
245 10 - TITLE STATEMENT |
Title |
Python for data analysis : |
Sub Title |
data wrangling with Pandas, NumPy, and Jupyter / |
Statement of responsibility, etc |
Wes McKinney. |
250 ## - EDITION STATEMENT |
Edition statement |
3rd ed. |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) |
Place of publication |
Beijing : |
Name of publisher |
O'Reilly Media, |
Year of publication |
2022. |
300 ## - PHYSICAL DESCRIPTION |
Number of Pages |
xvi, 561 pages : |
Other physical details |
illustrations ; |
Dimensions |
24 cm |
500 ## - GENERAL NOTE |
General note |
Includes index. |
505 0# - FORMATTED CONTENTS NOTE |
Formatted contents note |
Preface<br/>1. Preliminaries<br/>1.1 What Is This Book About?<br/>What Kinds of Data?<br/>1.2 Why Python for Data Analysis?<br/>Python as Glue<br/>Solving the “Two-Language” Problem<br/>Why Not Python?<br/>1.3 Essential Python Libraries<br/>NumPy<br/>pandas<br/>matplotlib<br/>IPython and Jupyter<br/>SciPy<br/>scikit-learn<br/>statsmodels<br/>Other Packages<br/>1.4 Installation and Setup<br/>Miniconda on Windows<br/>GNU/Linux<br/>Miniconda on macOS<br/>Installing Necessary Packages<br/>Integrated Development Environments and Text Editors<br/>1.5 Community and Conferences<br/>1.6 Navigating This Book<br/>Code Examples<br/>Data for Examples<br/>Import Conventions<br/>2. Python Language Basics, IPython, and Jupyter Notebooks<br/>2.1 The Python Interpreter<br/>2.2 IPython Basics<br/>Running the IPython Shell<br/>Running the Jupyter Notebook<br/>Tab Completion<br/>Introspection<br/>2.3 Python Language Basics<br/>Language Semantics<br/>Scalar Types<br/>Control Flow<br/>2.4 Conclusion<br/>3. Built-In Data Structures, Functions, and Files
<br/>3.1 Data Structures and Sequences<br/>Tuple<br/>List<br/>Dictionary<br/>Set<br/>Built-In Sequence Functions<br/>List, Set, and Dictionary Comprehensions<br/>3.2 Functions<br/>Namespaces, Scope, and Local Functions<br/>Returning Multiple Values<br/>Functions Are Objects<br/>Anonymous (Lambda) Functions<br/>Generators<br/>Errors and Exception Handling<br/>3.3 Files and the Operating System<br/>Bytes and Unicode with Files<br/>3.4 Conclusion<br/>4. NumPy Basics: Arrays and Vectorized Computation<br/>4.1 The NumPy ndarray: A Multidimensional Array Object<br/>Creating ndarrays<br/>Data Types for ndarrays<br/>Arithmetic with NumPy Arrays<br/>Basic Indexing and Slicing<br/>Boolean Indexing<br/>Fancy Indexing<br/>Transposing Arrays and Swapping Axes<br/>4.2 Pseudorandom Number Generation<br/>4.3 Universal Functions: Fast Element-Wise Array Functions<br/>4.4 Array-Oriented Programming with Arrays<br/>Expressing Conditional Logic as Array Operations<br/>Mathematical and Statistical Methods<br/>Methods for Boolean Arrays<br/>Sorting<br/>Unique and Other Set Logic<br/>4.5 File Input and Output with Arrays<br/>4.6 Linear Algebra<br/>4.7 Example: Random Walks<br/>Simulating Many Random Walks at Once<br/>4.8 Conclusion<br/>5. Getting Started with pandas
<br/>5.1 Introduction to pandas Data Structures<br/>Series<br/>DataFrame<br/>Index Objects<br/>5.2 Essential Functionality<br/>Reindexing<br/>Dropping Entries from an Axis<br/>Indexing, Selection, and Filtering<br/>Arithmetic and Data Alignment<br/>Function Application and Mapping<br/>Sorting and Ranking<br/>Axis Indexes with Duplicate Labels<br/>5.3 Summarizing and Computing Descriptive Statistics<br/>Correlation and Covariance<br/>Unique Values, Value Counts, and Membership<br/>5.4 Conclusion<br/>6. Data Loading, Storage, and File Formats<br/>6.1 Reading and Writing Data in Text Format<br/>Reading Text Files in Pieces<br/>Writing Data to Text Format<br/>Working with Other Delimited Formats<br/>JSON Data<br/>XML and HTML: Web Scraping<br/>6.2 Binary Data Formats<br/>Reading Microsoft Excel Files<br/>Using HDF5 Format<br/>6.3 Interacting with Web APIs<br/>6.4 Interacting with Databases<br/>6.5 Conclusion<br/>7. Data Cleaning and Preparation
<br/>7.1 Handling Missing Data<br/>Filtering Out Missing Data<br/>Filling In Missing Data<br/>7.2 Data Transformation<br/>Removing Duplicates<br/>Transforming Data Using a Function or Mapping<br/>Replacing Values<br/>Renaming Axis Indexes<br/>Discretization and Binning<br/>Detecting and Filtering Outliers<br/>Permutation and Random Sampling<br/>Computing Indicator/Dummy Variables<br/>7.3 Extension Data Types<br/>7.4 String Manipulation<br/>Python Built-In String Object Methods<br/>Regular Expressions<br/>String Functions in pandas<br/>7.5 Categorical Data<br/>Background and Motivation<br/>Categorical Extension Type in pandas<br/>Computations with Categoricals<br/>Categorical Methods<br/>7.6 Conclusion<br/>8. Data Wrangling: Join, Combine, and Reshape<br/>8.1 Hierarchical Indexing<br/>Reordering and Sorting Levels<br/>Summary Statistics by Level<br/>Indexing with a DataFrame’s columns<br/>8.2 Combining and Merging Datasets<br/>Database-Style DataFrame Joins<br/>Merging on Index<br/>Concatenating Along an Axis<br/>Combining Data with Overlap<br/>8.3 Reshaping and Pivoting<br/>Reshaping with Hierarchical Indexing<br/>Pivoting “Long” to “Wide” Format<br/>Pivoting “Wide” to “Long” Format<br/>8.4 Conclusion<br/>9. Plotting and Visualization
<br/>9.1 A Brief matplotlib API Primer<br/>Figures and Subplots<br/>Colors, Markers, and Line Styles<br/>Ticks, Labels, and Legends<br/>Annotations and Drawing on a Subplot<br/>Saving Plots to File<br/>matplotlib Configuration<br/>9.2 Plotting with pandas and seaborn<br/>Line Plots<br/>Bar Plots<br/>Histograms and Density Plots<br/>Scatter or Point Plots<br/>Facet Grids and Categorical Data<br/>9.3 Other Python Visualization Tools<br/>9.4 Conclusion<br/>10. Data Aggregation and Group Operations<br/>10.1 How to Think About Group Operations<br/>Iterating over Groups<br/>Selecting a Column or Subset of Columns<br/>Grouping with Dictionaries and Series<br/>Grouping with Functions<br/>Grouping by Index Levels<br/>10.2 Data Aggregation<br/>Column-Wise and Multiple Function Application<br/>Returning Aggregated Data Without Row Indexes<br/>10.3 Apply: General split-apply-combine<br/>Suppressing the Group Keys<br/>Quantile and Bucket Analysis<br/>Example: Filling Missing Values with Group-Specific Values<br/>Example: Random Sampling and Permutation<br/>Example: Group Weighted Average and Correlation<br/>Example: Group-Wise Linear Regression<br/>10.4 Group Transforms and “Unwrapped” GroupBys<br/>10.5 Pivot Tables and Cross-Tabulation<br/>Cross-Tabulations: Crosstab<br/>10.6 Conclusion<br/>11. Time Series
<br/>11.1 Date and Time Data Types and Tools<br/>Converting Between String and Datetime<br/>11.2 Time Series Basics<br/>Indexing, Selection, Subsetting<br/>Time Series with Duplicate Indices<br/>11.3 Date Ranges, Frequencies, and Shifting<br/>Generating Date Ranges<br/>Frequencies and Date Offsets<br/>Shifting (Leading and Lagging) Data<br/>11.4 Time Zone Handling<br/>Time Zone Localization and Conversion<br/>Operations with Time Zone-Aware Timestamp Objects<br/>Operations Between Different Time Zones<br/>11.5 Periods and Period Arithmetic<br/>Period Frequency Conversion<br/>Quarterly Period Frequencies<br/>Converting Timestamps to Periods (and Back)<br/>Creating a PeriodIndex from Arrays<br/>11.6 Resampling and Frequency Conversion<br/>Downsampling<br/>Upsampling and Interpolation<br/>Resampling with Periods<br/>Grouped Time Resampling<br/>11.7 Moving Window Functions<br/>Exponentially Weighted Functions<br/>Binary Moving Window Functions<br/>User-Defined Moving Window Functions<br/>11.8 Conclusion<br/>12. Introduction to Modeling Libraries in Python<br/>12.1 Interfacing Between pandas and Model Code<br/>12.2 Creating Model Descriptions with Patsy<br/>Data Transformations in Patsy Formulas<br/>Categorical Data and Patsy<br/>12.3 Introduction to statsmodels<br/>Estimating Linear Models<br/>Estimating Time Series Processes<br/>12.4 Introduction to scikit-learn<br/>12.5 Conclusion<br/>13. Data Analysis Examples
<br/>13.1 Bitly Data from 1.USA.gov<br/>Counting Time Zones in Pure Python<br/>Counting Time Zones with pandas<br/>13.2 MovieLens 1M Dataset<br/>Measuring Rating Disagreement<br/>13.3 US Baby Names 1880–2010<br/>Analyzing Naming Trends<br/>13.4 USDA Food Database<br/>13.5 2012 Federal Election Commission Database<br/>Donation Statistics by Occupation and Employer<br/>Bucketing Donation Amounts<br/>Donation Statistics by State<br/>13.6 Conclusion<br/>A. Advanced NumPy<br/>A.1 ndarray Object Internals<br/>NumPy Data Type Hierarchy<br/>A.2 Advanced Array Manipulation<br/>Reshaping Arrays<br/>C Versus FORTRAN Order<br/>Concatenating and Splitting Arrays<br/>Repeating Elements: tile and repeat<br/>Fancy Indexing Equivalents: take and put<br/>A.3 Broadcasting<br/>Broadcasting over Other Axes<br/>Setting Array Values by Broadcasting<br/>A.4 Advanced ufunc Usage<br/>ufunc Instance Methods<br/>Writing New ufuncs in Python<br/>A.5 Structured and Record Arrays<br/>Nested Data Types and Multidimensional Fields<br/>Why Use Structured Arrays?<br/>A.6 More About Sorting<br/>Indirect Sorts: argsort and lexsort<br/>Alternative Sort Algorithms<br/>Partially Sorting Arrays<br/>numpy.searchsorted: Finding Elements in a Sorted Array<br/>A.7 Writing Fast NumPy Functions with Numba<br/>Creating Custom numpy.ufunc Objects with Numba<br/>A.8 Advanced Array Input and Output<br/>Memory-Mapped Files<br/>HDF5 and Other Array Storage Options<br/>A.9 Performance Tips<br/>The Importance of Contiguous Memory<br/>B. More on the IPython System
<br/>B.1 Terminal Keyboard Shortcuts<br/>B.2 About Magic Commands<br/>The %run Command<br/>Executing Code from the Clipboard<br/>B.3 Using the Command History<br/>Searching and Reusing the Command History<br/>Input and Output Variables<br/>B.4 Interacting with the Operating System<br/>Shell Commands and Aliases<br/>Directory Bookmark System<br/>B.5 Software Development Tools<br/>Interactive Debugger<br/>Timing Code: %time and %timeit<br/>Basic Profiling: %prun and %run -p<br/>Profiling a Function Line by Line<br/>B.6 Tips for Productive Code Development Using IPython<br/>Reloading Module Dependencies<br/>Code Design Tips<br/>B.7 Advanced IPython Features<br/>Profiles and Configuration<br/>B.8 Conclusion<br/>Index |
520 ## - SUMMARY, ETC. |
Summary, etc |
"Get complete instructions for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.6, the second edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You'll learn the latest versions of pandas, NumPy, IPython, and Jupyter in the process"--Page 4 of cover. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Subject |
Python (Computer program language) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Subject |
Programming languages (Electronic computers) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Subject |
Data mining. |
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Subject |
Data analysis |
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Subject |
Data mining. |
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Subject |
Python (Computer program language) |
942 ## - ADDED ENTRY ELEMENTS (KOHA) |
Source of classification or shelving scheme |
Dewey Decimal Classification |
Koha item type |
Books |