Learning Python

There is a wealth of resources available for learning Python. One site that provides a very easy introduction to the language is Codecademy. Stackify.com lists the ‘top 30 Python tutorials,’ so that is a good place to start looking for a resource that best suits your mode of learning. Python.org also has a list of resources for people who have never programmed before, and it appears to include some very helpful guides. Unfortunately, there are no Python tutorials from the Khan Academy, but the Google for Education Python class does have videos associated with much of its content, so this site might be a good one for those who learn best by watching videos.

One thing to be conscious of: when working in Notebooks there are two types of cells, code cells and markdown cells. The syntax used in each type of cell is very different, something you will eventually get used to. For example, in a code cell the pound symbol ( # ), also known as the number symbol or, in Python, the hash symbol, is used to identify a comment, while in a markdown cell the pound symbol is used to define a title (bolded and in a larger font size).
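As a small illustration of the code-cell side of this, the sketch below shows how the hash symbol behaves in Python. The variable name total is made up for the example.

```python
# This whole line is a comment: Python ignores everything after the hash symbol.
total = 2 + 3  # a comment can also follow code on the same line

# In a markdown cell the same symbol would instead start a title, for example:
#     # My Notebook Title
print(total)
```

If the same lines were pasted into a markdown cell, nothing would be calculated; the lines starting with # would simply be displayed as titles.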

A general rule in programming is that you can never have enough comments in a program.  Often you can write some code and then, later, you can’t figure out what the code is actually doing! This is not an atypical experience, so experienced programmers know that including many comments is the first step in becoming an efficient programmer.

Every programming language has a basic set of rules on how to format the code (i.e., syntax), and Python is no exception.   Some of the important rules to note include:

  • Case is important; that is, Variable is different from variable, so make sure that you always use the same case when writing code (in Python, you are generally encouraged to use lower case).
    • A variable is simply a name that refers to, or represents, a value. For example:
      x = 9
      x is the variable and 9 would be its value.
  • Don’t use punctuation characters such as @, $, ! and % in variable names. Do not start a variable name with a number.
  • Use longer variable names rather than shorter ones, since that will help you follow your code. For example, a variable named ‘count’ will make more sense than one named ‘c’.
  • More generally within Python, a variable name belongs to a larger group of names that also includes the names of functions and classes. Together, these names are known as identifiers in Python.
  • Variables can contain (represent) single numbers, integer or real (e.g., x = 9 or y = 9.0), or text (strings) such as x = 'nine'. Formally, these (integer, real, and text) are known as ‘data types’; Python calls them int, float, and str, and there are many more than just these three. (Pandas and NumPy use the term ‘dtype’ for the data type of the values held in a table or array.)
  • Variables can also contain (represent) lists or arrays (tables) of numbers or strings (or a combination of both). For example, x = [1, 3, 5, 7, 9] would assign the list of numbers (1, 3, 5, 7, 9) to the variable x. Notice the use of the square brackets ( [ ] ) to indicate that the enclosed values form a list. Formally, lists and arrays are also known as ‘data structures’ within Python. (In Pandas, a single column of values is called a Series.)
    Here you can find a relatively simple overview of how Python handles variables (numbers, strings, and lists).
    Pandas, a Python library, is often used to handle tables. (See more below on Pandas.)
  • Strings (or text) in Python are delimited by enclosing the text in either single quotes ( 'string' ) or double quotes ( "string" ). When typing code, use the plain (straight) quote characters on your keyboard.
  • The backslash ( \ ) is a special character in Python; it is called an escape character. If a backslash precedes what otherwise would be an ‘illegal’ character in a string (such as a quote mark inside a string delimited by the same kind of quote), that character will be permitted to appear (that is, the backslash ‘escapes’ the character from its ‘true’ meaning). This means that to show a backslash in a text string you need to ‘escape’ it by using a double backslash (i.e., '\\' will be read by Python, and displayed by print(), as a single \ ).
  • Indentation is VERY important in Python code. Unlike most other programming languages that use semi-colons ( ; ) or curly brackets ( {} ) to define blocks of code (or a function) or the end of a line of code, Python uses indentation to define a block of code (or a function) and a simple line return to indicate the end of a line of code (although there are exceptions to this rule). What this means is:
    • do not indent lines of code unless it is required. That is, if you are creating a ‘block’ of code that belongs together, such as a for loop or a function, you must indent it:
      for <variable> in <sequence>:
          # body of the loop: a set of statements
          # that requires repeated execution
      
    • do not use the tab key to indent text; the preferred style is to use four spaces;
    • basic rule: if the line of code ends in a colon ( : ), you will need to indent the subsequent line(s) of code.
  • There are ‘reserved’ words that cannot be used as variable names or any other identifier name, such as: and, else, not, for, or, while, with, since they have intrinsic meanings in Python. You can find a list of the reserved words here.
  • Unlike many other programming languages, Python does not require you to declare whether a variable will represent an integer, a real number, or text; the ‘declaration’ is done upon assignment. What does this mean? It means that you can simply type in x = 9 and Python will then know that x represents an integer. You could subsequently type in x = 'nine' and Python would now associate x with the text string 'nine'.

    In a programming language such as C++ you would first have to tell the program that x will represent an integer (e.g., int x;) and then make the assignment (x = 9;). Not having to declare variables makes writing code far easier for you (and, since you don’t need to worry about making a declaration, you can forget about this rule now that you know it!).

  • If you make a mistake in writing the code and attempt to ‘run’ the code, Python will tell you, although often the error message isn’t all that informative to new programmers.
  • Python assumes that the first element of a list is the ‘0’ element; the second element would be the ‘1’ element, etc. This can be a bit confusing since a list can contain, as above, 5 numbers [1, 3, 5, 7, 9] but the last number is element ‘4’ of the list, not element ‘5’ (since the first element is considered to be the 0 element).
  • Columns in a table can be referred to either with a number, as in the fourth column would be column ‘3’ (since the first column is column 0), or with a name. This is similar to the way that a column in Excel can have a name assigned to it, or you can refer to the column identifier (such as A). Similarly, you can also refer to rows by using the row number or, if present, the row name. When referring to positions in a table, rows come first and columns second, written (rows, columns) or (index, col); in Pandas, axis=0 refers to the rows and axis=1 refers to the columns.
  • Python uses the same sets of operators as most other computer languages.  Thus you can add numbers (+), divide (/), assign values (x = 9), compare values ( x == y [is x equal to y]) and much more.
  • Functions and Methods are important elements of the Python language. Functions do things like tell you the min value in a list [ min() ], the max value [ max() ], the length of a list [ len() ], round a number [ round() ], sum a list of numbers [ sum() ].
  • Methods work with strings and lists (and more), and can do things like capitalize the first letter of a string [ capitalize() ], convert all characters to lower case [ lower() ], and tell you if the string is composed only of digits [ isdigit() ].
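Several of the rules above about names, data types, and strings can be tried out in a single code cell. The following is a minimal sketch, not part of the lab materials; the variable names and the folder path are made up for illustration.

```python
# Names are case-sensitive: count and Count are two different variables.
count = 9
Count = 10

# No declaration is needed; the data type follows the value that is assigned.
x = 9                             # an integer (int)
y = 9.0                           # a real number (float)
kind_before = type(x).__name__    # 'int'

x = 'nine'                        # the same name can later refer to text
kind_after = type(x).__name__     # 'str'

# Single and double quotes both delimit strings.
same_text = 'nine' == "nine"      # True

# A backslash escapes the next character, so '\\' is a single backslash.
folder = 'C:\\data'               # hypothetical path, for illustration only

print(count, Count)
print(kind_before, kind_after, type(y).__name__)
print(same_text)
print(folder)
```

The last line prints the path with one backslash, even though two were typed inside the quotes.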

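The rules about list positions, indentation, operators, functions, and methods can be sketched in the same way. This example reuses the list [1, 3, 5, 7, 9] from the rules above; every other name in it is made up for illustration.

```python
numbers = [1, 3, 5, 7, 9]

# Counting starts at 0, so five values occupy positions 0 through 4.
first = numbers[0]                # 1
last = numbers[4]                 # 9

# Built-in functions take their input inside the parentheses.
smallest = min(numbers)           # 1
largest = max(numbers)            # 9
how_many = len(numbers)           # 5
total = sum(numbers)              # 25
rounded = round(3.14159, 2)       # 3.14

# Methods are attached to a value with a dot.
name = 'python'
capitalized = name.capitalize()   # 'Python'
lowered = 'PYTHON'.lower()        # 'python'
digits_only = '2024'.isdigit()    # True

# A line ending in a colon is followed by a block indented with four spaces.
running_total = 0
for number in numbers:
    running_total = running_total + number

# A single = assigns a value; a double == compares two values.
totals_match = running_total == total   # True

print(first, last, smallest, largest, how_many, total, rounded)
print(capitalized, lowered, digits_only, totals_match)
```

Note the difference in how the two kinds of command are written: a function wraps its input, as in len(numbers), while a method follows the value it works on, as in name.capitalize().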
Libraries are important add-ons that provide additional functionality to Python. Hundreds of Python libraries have been developed in order to make it easy to do image processing, scientific computing (SciPy), machine learning (scikit-learn), database (SQL) manipulation, table/array manipulation (Pandas and NumPy), scientific graphing (Matplotlib), data visualization (Bokeh), and much more. Here is a list of 30 useful Python libraries and packages for beginners that includes links to the libraries mentioned above.

If you look at the Python tab in ArcGIS Pro (Project > Python) you will see that ESRI has included over 100 different Python packages in the ArcGIS Pro Notebook environment! Packages added include SciPy, Pandas, NumPy, and Matplotlib. All of these libraries add considerable functionality to ArcGIS Pro Notebooks, but, of course, they all have their own terms (e.g., methods) and conditions and therefore require more learning. In order to make use of the functionality of a library you first need to import it into the Notebook environment. You will notice in the Notebook for Lab 1 that the first set of Python instructions is:

import pandas as pd
import arcpy as ap
import numpy as np

Those commands illustrate two things: one, the importation of three libraries, and two, the renaming of the libraries to something easier to type (pd instead of pandas, ap instead of arcpy, and np instead of numpy). The reason for the name simplification is that in order to use any of the commands specific to a library you need to explicitly make reference to the library by prepending the library name, or its shorter alias, to the command (e.g., pd.read_csv, which indicates that Python is to look into the pandas library for the command read_csv).
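The effect of renaming a library on import can be seen in a short sketch. As an assumption made only so that the example runs anywhere, it uses Python's built-in statistics library as a stand-in; the same pattern applies to pandas, arcpy, and numpy. The alias st and the variable names are made up for illustration.

```python
# Stand-in example: the built-in statistics library is used here only
# because it is always available; pandas and numpy follow the same pattern.
import statistics           # import under the full library name
import statistics as st     # import again under a shorter alias

values = [1, 3, 5, 7, 9]

# Both lines call the same command, mean, inside the same library.
long_form = statistics.mean(values)
short_form = st.mean(values)

print(long_form, short_form)
```

Both calls give the same result; the alias only saves typing each time a command from the library is used.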