
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/metro_regression.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_metro_regression.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_metro_regression.py:


Metro Regression
================

Data imputation example on the Santiago Metro network.

Hide the number of exits at a randomly chosen metro station and estimate it
from the rest of the network with Tikhonov regression. Then repeat the
estimate for every station in the network and plot the absolute error of the
regression. Lastly, a one-hop estimate built from the neighboring nodes is
computed as a baseline to compare against the regression error.

To run this example, you need three data files in a `data` directory under
the current working directory, which is where the script looks for them.

1. Download the file `Tablas de subidas y bajadas nov23.zip` from this link:
   https://www.dtpm.cl/descargas/modelos_y_matrices/Tablas%20de%20subidas%20y%20bajadas%20nov23.zip
   Then, uncompress the zip file and copy `2023.11 Matriz_baj_SS_MH.xlsb`
   into the `data` directory.

2. Download the file `santiago_metro_stations_coords.geojson` from this link:
   https://zenodo.org/records/11637462/files/santiago_metro_stations_coords.geojson

3. Download the file `santiago_metro_stations_connections.txt` from this link:
   https://zenodo.org/records/11637462/files/santiago_metro_stations_connections.txt

.. GENERATED FROM PYTHON SOURCE LINES 30-31

.. code-block:: Python

    import os


.. GENERATED FROM PYTHON SOURCE LINES 32-46

.. code-block:: Python

    import sys

    import matplotlib.pyplot as plt
    import networkx as nx
    import numpy as np
    import pandas as pd
    from unidecode import unidecode

    from pygsp2 import graphs, learning
    from pygsp2.utils_examples import (fetch_data, make_metro_graph,
                                       metro_database_preprocessing,
                                       plot_signal_in_graph)

    current_dir = os.getcwd()
    os.chdir(current_dir)


.. GENERATED FROM PYTHON SOURCE LINES 47-59

.. code-block:: Python

    assets_dir = os.path.join(current_dir, 'data')
    fetch_data(assets_dir, 'metro')

    try:
        commutes = pd.read_excel(os.path.join(assets_dir,
                                              '2023.11 Matriz_baj_SS_MH.xlsb'),
                                 header=1, sheet_name='bajadas_prom_laboral')
    except FileNotFoundError:
        # Report the directory that was actually searched
        print(f'Data file was not found in:\n {assets_dir}')
        print('Download it from:\n' +
              r'https://www.dtpm.cl/descargas/modelos_y_matrices/Tablas%20de%20subidas%20y%20bajadas%20nov23.zip')
        sys.exit(1)


.. GENERATED FROM PYTHON SOURCE LINES 60-77

.. code-block:: Python

    G, pos = make_metro_graph(edgesfile=os.path.join(assets_dir, 'santiago_metro_stations_connections.txt'),
                              coordsfile=os.path.join(assets_dir, 'santiago_metro_stations_coords.geojson'))

    pos_list = [(G.nodes[node]['y'], G.nodes[node]['x']) for node in G.nodes]

    # Extract adjacency matrix
    W = nx.adjacency_matrix(G).toarray()
    G_pygsp = graphs.Graph(W)

    # Node degree matrix
    D = np.diag(G_pygsp.d)
    # Compute inverse for later
    D_inv = np.linalg.inv(D)

    # Convert station names to uppercase and strip accents
    stations = [name.upper() for name in list(G.nodes)]
    stations = [unidecode(station) for station in stations]


.. GENERATED FROM PYTHON SOURCE LINES 78-80

.. code-block:: Python

    metro_commutes, signal = metro_database_preprocessing(commutes, stations)


.. GENERATED FROM PYTHON SOURCE LINES 81-101

.. code-block:: Python

    signal2 = signal.copy()
    station_idx = np.random.randint(0, len(signal2))
    # station_idx = 24
    print(f'Deleted Station: {stations[station_idx]}')
    signal2[station_idx] = np.nan

    mask = np.ones(len(signal)).astype(bool)
    mask[station_idx] = False

    # Use Tikhonov regression to recover the signal
    recovered_signal = learning.regression_tikhonov(G_pygsp, signal2, mask, tau=0.5)

    # One-hop estimate from the neighboring stations (y = A D^{-1} x)
    average = (W @ D_inv @ signal)[station_idx]

    print(f'Estimated: {recovered_signal[station_idx]:.2f}')
    print(f'One-hop Average: {average:.2f}')
    print(f'Real: {signal[station_idx]:.2f}')


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Deleted Station: PRESIDENTE PEDRO AGUIRRE CERDA
    Estimated: nan
    One-hop Average: 10534.85
    Real: 6297.80


.. GENERATED FROM PYTHON SOURCE LINES 102-121

.. code-block:: Python

    tikhonov_estimation = np.zeros_like(signal)
    average_estimation = W @ D_inv @ signal

    for i in range(len(signal)):
        # Allocate new signal
        signal2 = signal.copy()
        print(f'Deleted Station: {stations[i]}')
        # Delete value in the signal
        signal2[i] = np.nan

        mask = np.ones(len(signal)).astype(bool)
        mask[i] = False

        # Use Tikhonov regression to recover the signal
        recovered_signal = learning.regression_tikhonov(G_pygsp, signal2, mask, tau=0.5)
        tikhonov_estimation[i] = recovered_signal[i]

    abs_err = np.abs(tikhonov_estimation - signal)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Deleted Station: TOESCA
    Deleted Station: RONDIZZONI
    Deleted Station: ESCUELA MILITAR
    Deleted Station: ALCANTARA
    Deleted Station: EL GOLF
    Deleted Station: TOBALABA
    Deleted Station: PEDRO DE VALDIVIA
    Deleted Station: MANUEL MONTT
    Deleted Station: UNIVERSIDAD CATOLICA
    Deleted Station: SANTA LUCIA
    Deleted Station: REPUBLICA
    Deleted Station: UNION LATINOAMERICANA
    Deleted Station: ESTACION CENTRAL
    Deleted Station: CRISTOBAL COLON
    Deleted Station: FRANCISCO BILBAO
    Deleted Station: PRINCIPE DE GALES
    Deleted Station: SIMON BOLIVAR
    Deleted Station: BELLAS ARTES
    Deleted Station: CUMMING
    Deleted Station: UNIVERSIDAD DE SANTIAGO
    Deleted Station: SAN ALBERTO HURTADO
    Deleted Station: ECUADOR
    Deleted Station: LAS REJAS
    Deleted Station: PAJARITOS
    Deleted Station: SANTA ISABEL
    Deleted Station: MIRADOR
    Deleted Station: PEDRERO
    Deleted Station: CAMINO AGRICOLA
    Deleted Station: CARLOS VALDOVINOS
    Deleted Station: RODRIGO DE ARAYA
    Deleted Station: PARQUE O'HIGGINS
    Deleted Station: QUINTA NORMAL
    Deleted Station: LOS ORIENTALES
    Deleted Station: GRECIA
    Deleted Station: LOS PRESIDENTES
    Deleted Station: QUILIN
    Deleted Station: LAS TORRES
    Deleted Station: MACUL
    Deleted Station: ROJAS MAGALLANES
    Deleted Station: TRINIDAD
    Deleted Station: LOS QUILLAYES
    Deleted Station: ELISA CORREA
    Deleted Station: HOSPITAL SOTERO DEL RIO
    Deleted Station: PROTECTORA DE LA INFANCIA
    Deleted Station: BELLAVISTA DE LA FLORIDA
    Deleted Station: CEMENTERIOS
    Deleted Station: ZAPADORES
    Deleted Station: DORSAL
    Deleted Station: VESPUCIO NORTE
    Deleted Station: EL LLANO
    Deleted Station: SAN MIGUEL
    Deleted Station: LO VIAL
    Deleted Station: DEPARTAMENTAL
    Deleted Station: CIUDAD DEL NINO
    Deleted Station: LO OVALLE
    Deleted Station: EL PARRON
    Deleted Station: SANTA JULIA
    Deleted Station: LA GRANJA
    Deleted Station: SANTA ROSA
    Deleted Station: SAN RAMON
    Deleted Station: MANQUEHUE
    Deleted Station: HERNANDO DE MAGALLANES
    Deleted Station: LOS DOMINICOS
    Deleted Station: BARRANCAS
    Deleted Station: LAGUNA SUR
    Deleted Station: LAS PARCELAS
    Deleted Station: MONTE TABOR
    Deleted Station: LAS MERCEDES
    Deleted Station: SAN JOSE DE LA ESTRELLA
    Deleted Station: PARQUE BUSTAMANTE
    Deleted Station: GRUTA DE LOURDES
    Deleted Station: BLANQUEADO
    Deleted Station: PUDAHUEL
    Deleted Station: LO PRADO
    Deleted Station: SANTIAGO BUERAS
    Deleted Station: SANTA ANA
    Deleted Station: PLAZA DE PUENTE ALTO
    Deleted Station: SALVADOR
    Deleted Station: DEL SOL
    Deleted Station: PRESIDENTE PEDRO AGUIRRE CERDA
    Deleted Station: BIO BIO
    Deleted Station: INES DE SUAREZ
    Deleted Station: NUBLE
    Deleted Station: SAN PABLO
    Deleted Station: PATRONATO
    Deleted Station: CERRO BLANCO
    Deleted Station: EINSTEIN
    Deleted Station: BAQUEDANO
    Deleted Station: MONSENOR EYZAGUIRRE
    Deleted Station: CHILE ESPANA
    Deleted Station: VILLA FREI
    Deleted Station: LOS HEROES
    Deleted Station: LOS LIBERTADORES
    Deleted Station: CERRILLOS
    Deleted Station: LA MONEDA
    Deleted Station: PLAZA DE MAIPU
    Deleted Station: LO VALLEDOR
    Deleted Station: ESTADIO NACIONAL
    Deleted Station: NEPTUNO
    Deleted Station: FRANKLIN
    Deleted Station: EL BOSQUE
    Deleted Station: OBSERVATORIO
    Deleted Station: COPA LO MARTINEZ
    Deleted Station: HOSPITAL EL PINO
    Deleted Station: FERROCARRIL
    Deleted Station: LO CRUZAT
    Deleted Station: PLAZA QUILICURA
    Deleted Station: PARQUE ALMAGRO
    Deleted Station: SAN JOAQUIN
    Deleted Station: LOS LEONES
    Deleted Station: LA CISTERNA
    Deleted Station: MATTA
    Deleted Station: HOSPITALES
    Deleted Station: PLAZA CHACABUCO
    Deleted Station: CONCHALI
    Deleted Station: VIVACETA
    Deleted Station: CARDENAL CARO
    Deleted Station: PLAZA DE ARMAS
    Deleted Station: PUENTE CAL Y CANTO
    Deleted Station: UNIVERSIDAD DE CHILE
    Deleted Station: NUNOA
    Deleted Station: PLAZA EGANA
    Deleted Station: FERNANDO CASTILLO VELASCO
    Deleted Station: VICUNA MACKENNA
    Deleted Station: IRARRAZAVAL
    Deleted Station: VICENTE VALDES


.. GENERATED FROM PYTHON SOURCE LINES 122-124

.. code-block:: Python

    fig, ax = plot_signal_in_graph(G, abs_err, title='Error of Tikhonov Regression', label='Error absoluto')
    #fig.savefig('metro_regression_tikhonov_error.png', dpi=300)


.. image-sg:: /examples/images/sphx_glr_metro_regression_001.png
   :alt: Error of Tikhonov Regression
   :srcset: /examples/images/sphx_glr_metro_regression_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 125-126

Recompute the absolute error, this time for the one-hop estimate.

.. GENERATED FROM PYTHON SOURCE LINES 126-131

.. code-block:: Python

    abs_err = np.abs(average_estimation - signal)
    fig, ax = plot_signal_in_graph(G, abs_err, title=r'Error of $y = AD^{-1}x$', label='Error absoluto')
    #fig.savefig('metro_regression_error_abs.png', dpi=300)

    plt.show()


.. image-sg:: /examples/images/sphx_glr_metro_regression_002.png
   :alt: Error of $y = AD^{-1}x$
   :srcset: /examples/images/sphx_glr_metro_regression_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 10.303 seconds)

**Estimated memory usage:**  330 MB


.. _sphx_glr_download_examples_metro_regression.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: metro_regression.ipynb <metro_regression.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: metro_regression.py <metro_regression.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: metro_regression.zip <metro_regression.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
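
As a rough companion to the example above, the sketch below reproduces the two estimators on a three-node path graph using only NumPy. It is not the pygsp2 implementation: `tikhonov_impute` is a hypothetical helper, and it assumes the standard Tikhonov objective, minimizing ||M(x - y)||^2 + tau x^T L x with the combinatorial Laplacian L = D - W, which leads to the linear system (M + tau L) x = M y. The hidden entry is replaced by 0 before solving, since NaN multiplied by 0 is still NaN in floating-point arithmetic. The sketch also contrasts the estimate W D^{-1} x used in the example with D^{-1} W x, which is the plain mean of the neighboring values.

```python
import numpy as np


def tikhonov_impute(W, y, mask, tau=0.5):
    """Hypothetical helper (not part of pygsp2).

    Solves (M + tau * L) x = M y, where L = D - W is the combinatorial
    Laplacian and M is the diagonal matrix of the observation mask.
    """
    W = np.asarray(W, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    L = np.diag(W.sum(axis=1)) - W
    M = np.diag(mask.astype(float))
    # Replace hidden entries (possibly NaN) by 0 so they cannot propagate
    y_obs = np.where(mask, y, 0.0)
    return np.linalg.solve(M + tau * L, y_obs)


# Path graph 0 - 1 - 2 with a made-up signal (illustrative values only)
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
signal = np.array([10.0, 22.0, 30.0])

# Hide the middle node, as the example does for one station
mask = np.array([True, False, True])
observed = signal.copy()
observed[~mask] = np.nan

recovered = tikhonov_impute(W, observed, mask, tau=0.5)

D_inv = np.diag(1.0 / W.sum(axis=1))
col_normalized = W @ D_inv @ signal   # each neighbor divided by its own degree
neighbor_mean = D_inv @ W @ signal    # arithmetic mean of the neighbors

print(f'Tikhonov estimate: {recovered[1]:.2f}')
print(f'W D^-1 x:          {col_normalized[1]:.2f}')
print(f'D^-1 W x:          {neighbor_mean[1]:.2f}')
print(f'Real:              {signal[1]:.2f}')
```

On this small graph the Tikhonov estimate and the neighbor mean both give 20 for the hidden node, while W D^{-1} x gives 40 because both neighbors have degree 1. The two normalizations coincide only when all neighbors have the same degree as the node being estimated.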