{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Two of the central building blocks of the EarthAI Notebook environment are Jupyter Notebooks and the Python programming language. This article guides the Python programming novice through the most basic concepts and syntax.\n", "\n", "*Note: if you would like to follow along with the code blocks in this tutorial, you can download the companion notebook attached at the end of this article.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python is a widely used programming language in data science, machine learning and artificial intelligence, data visualization, web development, and a wide range of other use cases. This \"Swiss Army Knife\" versatility is part of the reason for its widespread use in many fields. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python is an interpreted language. Unlike compiled languages, such as C++, interpreted languages do not go through a separate compilation step before they are executed. Instead, Python code is executed \"on the fly\" by its interpreter. The details of compiled versus interpreted languages are beyond the scope of this crash course in Python, but the interpreted nature of Python lends itself nicely to the Notebook environment, where experimenting with code and quickly seeing the results is a central design trait." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Primitive Data Types and Variable Assignment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Like other programming languages, Python relies on a few primitive data types to build up more complex expressions and data structures. First, there are integers or whole numbers, such as 1, 2, and 5413. Next, there are floating point numbers, often called floats, such as 3.14 and -45.1283. Essentially, these are decimal numbers. 
A string is a sequence of characters surrounded by single or double quotes, such as \"Hello\". Finally, there is a boolean data type that is either True or False and results from evaluation of comparison statements like `1 > 2`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Variable assignment in Python is straightforward. Unlike statically typed languages, such as C, where you have to declare the variable data type when creating it, Python infers the type from the assigned value. Variables can be assigned to any data type or data structure." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_value = 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After running the code cell above, the variable **my_value** is a container that holds the value of 2. You can now refer to the variable name in place of the value in other expressions and statements throughout your code. We said that a type need not be declared when creating a variable, so let's check the type of **my_value**, which should be int since 2 is an integer:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(my_value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Checking the data type of the variable **my_value** using the `type` function does indeed return int. We can do anything with **my_value** that we could do with the number 2 in Python, as **my_value** is a reference to 2." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_value * 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_value * my_value * my_value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can reassign the variable **my_value** to a new value that need not be the same data type."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_value = 3.14" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_value" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(my_value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reassigning **my_value** to 3.14 changes not only the value of the variable, but also its type. We can also reassign **my_value** using **my_value** itself in an expression:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_value = my_value + 1\n", "my_value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "String data types can also be assigned to variables:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_string = 'Astraea'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(my_string)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Single and double quotes are equally valid for constructing a string, but the opening and closing quotes must match; mixing them (for example, `'1234\"`) raises a SyntaxError:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_num_string = \"1234\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can perform a few math-like operations on strings, such as repeating and concatenating them:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_string * 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Multiplying a string by an integer repeats the string that many times. One of the most common string operations is adding or concatenating two or more strings together."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_string + ' ' + my_num_string" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the case above we added a string containing a single blank space between the two strings. Note, however, that strings cannot be concatenated with other data types; for example, `my_string + 1` fails." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Combining a string and an integer throws a TypeError. But what about combining a float and an integer?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3 + 4.3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This evaluates to a float. Under the hood, the integer 3 is converted to a float before the addition. Combining a float and an integer in an arithmetic expression always yields a float. There are functions for explicit type conversion if you need them. Below, the float 4.3 is converted to an int." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "int(4.3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "int(5.8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, converting a float to an integer is not a rounding process. The decimal portion is simply truncated." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "float('3')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(float('3'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings can also be converted to floats and integers, and vice versa." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, there is the Boolean data type, which is either True or False and typically results from evaluating a logical expression."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A double equal sign evaluates equality whereas a single equal sign is used for assignment.\n", "1 == 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above evaluates to True, as 1 does indeed equal 1. We've also added a comment to a code cell. This is common practice in programming, and in Python it is accomplished by preceding the comment text with a `#`. Multi-line comments in a code cell are commonly written between a pair of triple quotes (`'''` or `\"\"\"`). Strictly speaking, this creates a string that Python evaluates and discards rather than a true comment:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "'''This is a long multiline comment.\n", " This shows you how to add a lot of text\n", " to a code cell.'''\n", "\n", "1 == 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also check the equivalence of strings." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "'Dog' == 'Dog'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "'Dog' == 'Cat'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also evaluate compound logical expressions: `True and False` evaluates to False, whereas `True or False` evaluates to True. These are just a couple of examples of how Boolean expressions in Python can be used." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(3 > 2) & (4 > 5) # & is the \"and\" operator; for plain Booleans the keyword `and` gives the same result." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(3 > 2) | (4 > 5) # | is the \"or\" operator; for plain Booleans the keyword `or` gives the same result."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Data Structures" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The primitive data types in Python can be combined into more complex data collections.\n", "\n", "## Lists\n", "\n", "Perhaps the most common Python data structure is the *list*. A Python list is an ordered collection of elements that can be of the same (homogeneous) or different (heterogeneous) data types. A list is constructed with square brackets `[]` with list elements separated by commas." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list = [1, 1, 2, 3, 5, 8] # homogeneous data type list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(my_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A list in Python is of \"list\" type and can be assigned to a variable." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_mixed_list = [1, 'hi', 3.4, True, 42] # heterogeneous data type list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_mixed_list * 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Like strings, lists can be repeated and concatenated." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_combined_list = my_list + my_mixed_list\n", "my_combined_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the most powerful operations on a list is accessing specific elements of that list. This is done by referencing the index, or location, of the list element. Note that indexing in Python starts at 0, so the first element in a list is at index 0. You access an element by writing the list name followed by the index in square brackets."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list[0] # this returns the first element of my_list." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can check the length of a list. We see that there are six elements in **my_list**." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(my_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can select multiple elements by use of a colon. The first index is inclusive; the last is not. So if you wanted to select the first three elements of **my_list** you would use `my_list[0:3]` to obtain the elements at indices 0, 1, and 2." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list[0:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings behave like lists when it comes to indexing: individual characters can be accessed by position, just as list elements are." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_string" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_string[2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But an important difference between strings and lists is that strings are immutable. Characters in a string cannot be reassigned (attempting `my_string[0] = 'a'` raises a TypeError), whereas elements in a list can." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To change the second element in **my_list** from 1 to 'Earth', we simply reassign that element." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list[1] = 'Earth'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also remove elements from a list and append elements to it."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list = [1, 2, 3, 4]\n", "my_list.pop(1) # This removes the index 1 element from the list.\n", "print(my_list)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list.append('hi') # This appends the string 'hi' to the end of the list.\n", "print(my_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, lists can be nested in other lists. There is no practical limit to the amount of nesting that can happen in a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_new_list = [1, 'hi', [1, 2, 3, 4], 9.87]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_new_list[2] # the third element of \"my_new_list\" is a list." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do we access the second element in the third element of **my_new_list**?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_new_list[2][1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tuples\n", "\n", "Tuples are data collections that are similar to lists except that like strings they are immutable. Once a tuple is created its elements cannot be changed. Tuples are formed with `()` and elements separated by commas. Tuples are used to store data that should not change during execution of a Python script." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_tuple = (1, 2, 3, 4, 5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_tuple[0] # Tuple elements are accessed as those in lists are." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionaries\n", "\n", "The final data collection type is a dictionary. 
Dictionaries are constructed with key-value pairs within braces `{}`. For lists and tuples, elements are accessed by their index; values in dictionaries are accessed by their associated key." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_scores = {'Fred': 65, 'Sally': 98, 'Sven': 87, 'Jeb': 90}\n", "type(my_scores)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_scores['Fred'] # If we want Fred's score we just use the 'Fred' key." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As with lists, we can change (or mutate) values or add new key-value pairs to an existing dictionary." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_scores['Fred'] = 70\n", "my_scores" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_scores['Amanda'] = 95\n", "my_scores" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_scores['Tim'] = 'did not show up' # we can also mix data types in the values" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_scores" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Functions and Methods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This last section covers functions and methods. Functions are reusable blocks of code. Methods are basically functions that are associated with specific object types in Python.\n", "\n", "We've already seen functions in use when we ran `type(my_list)` above. We were using the `type` function. Likewise, when we ran `my_list.append('hi')` we were using the `append` method on the object **my_list**.\n", "\n", "Objects can have specific methods associated with them. For instance, the string type has a method `capitalize()` that returns a copy of the string with its first character capitalized. 
The float object does not have this method. When creating your own classes of objects, you can define methods of your own to act on those objects." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions\n", "\n", "First, we'll write a simple function that prints \"Hello, Astraea!\" when it is called and then call that function in the following cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def my_function():\n", "    print('Hello, Astraea!')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_function()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All Python functions have the same structure. First comes the `def` keyword, followed by the function name of your choice, followed by parentheses. The function parameters go inside the parentheses. In this simple function there are none. A colon follows the parentheses to indicate the start of the function body.\n", "\n", "Unlike in many other languages, the whitespace (indentation) before the lines of code that make up the function body plays a critical role in Python. Indentation is automatically added when you hit ***Return*** after the function declaration in a Jupyter Notebook, but if you are working in a code editor that does not do that automatically, you need to add the indentation manually; otherwise you get an `IndentationError`. \n", "Once the function is defined it can be used at any time by simply calling it. Functions are first-class citizens in Python. That is, you can assign them to variables and pass them to other functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_variable = my_function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_variable()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's write a slightly more interesting function that requires an input parameter. 
We'll write a function that sums the elements of a list that is passed to the function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def listSum(values):\n", "    total = 0  # initialize the running sum\n", "    for item in values:\n", "        total += item\n", "    return total" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_list = [1, 2, 3, 4]\n", "listSum(my_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above function is called `listSum` and takes one parameter we've chosen to name `values`. We could have called the parameter anything, but descriptive names that do not shadow Python built-ins such as `list` or `sum` are recommended. The first thing we do within the function is define a variable called **total** and set it to 0. Basically, we need to initialize our summing variable. Next we run a for loop over all of the items in `values` and add each one to the running sum. Note that the loop body is indented one level deeper than the `for` line. Again, we could have named **item** anything as long as we're consistent throughout the function code block. The final step is to return the sum once the loop has finished.\n", "\n", "We then construct a list called **my_list** and call `listSum`, passing **my_list** as the function argument. The function returns the expected value of 10, which is the sum of the **my_list** elements." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Importing Libraries\n", "\n", "Functions are critical in Python as they provide reusable code blocks that eliminate the need for repetitive coding. However, in EarthAI workflows, and indeed in many Python workflows, most of the functions you need have already been written and are accessible through a rich collection of Python libraries. Accessing these libraries and their functions is as simple as importing the libraries into your Notebook workspace once they are installed. 
One important benefit of the EarthAI notebook environment is that most of the critical libraries for Earth Observation analyses come pre-installed, so you need only import the library as needed. Once a library is imported it is available throughout the Notebook session." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code cell below imports a library called Folium that is used to build interactive maps. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import folium as f" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now have access to all of the functions or methods in the Folium library. Libraries are often imported with an alias to simplify coding, which basically means less typing. In this case we've aliased folium as \"f\". The functions available in a library can be seen by typing the library name (or its alias) followed by a `.` and pressing ***Tab*** in a code cell. This dot notation, in which the library name is followed by a dot, is the common way to access a library's functions and methods.\n", "\n", "Below we'll create a simple interactive map using the `Map` function in Folium. The single argument we pass is `location`, a pair of latitude and longitude coordinates. However, many more input parameters are available. After typing `f.Map(`, hitting ***Shift+Tab*** pulls up a dialog box listing all of the possible input parameters to the function as well as more information about the function behavior. The other parameters are automatically set to default values if omitted in the function call. We can also assign the output of the function to a variable. Running the cell gives you a fully interactive map in your Notebook!"
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m = f.Map(location = (45.5236, -122.6750))\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You could have written your own `Map` function from scratch, but utilizing existing libraries and functions is a key component to streamlining Python workflows. The EarthAI Notebook environment has been designed to make many of the necessary Python libraries for Earth observation workflows easily accessible to the user. This includes the earthai library designed specifically for acquiring and analyzing Earth observation data through the Astraea platform." ] } ], "metadata": { "kernelspec": { "display_name": "EarthAI Environment", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" }, "zendesk": { "draft": true, "id": 360051309571, "section_id": 360008608152, "title": "Python Primer" } }, "nbformat": 4, "nbformat_minor": 4 }