Commit bcb1ce70 authored by Riccardo Boero

Added LICENSE etc.

parent 4fbf784a
# FACT Employment Official Statistics Changelog
All notable changes to this project will be documented in this file.
Releases mentioned here correspond to Docker registry entries and use semantic versioning in the form 'MAJOR.MINOR.PATCH'.
## Change entries
Added: For new features that have been added.
Changed: For changes in existing functionality.
Deprecated: For once-stable features removed in upcoming releases.
Removed: For features removed in this release.
Fixed: For any bug fixes.
Security: For vulnerabilities.
## [0.2] - 2024-10-02
Added:
- license
- changelog
- contributing
Fixed:
- update of docker base image
- update of buildings database
- table partitioning by country/state
## [0.1] - 2023-10-19
Added:
- first full structure and dataset creation
\ No newline at end of file
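The release headings in the changelog above follow a regular pattern, so they can be read mechanically. The following is a minimal sketch, assuming headings of the form `## [VERSION] - YYYY-MM-DD` as in the two entries shown; the names `HEADING` and `parse_release` are hypothetical and not part of this repository. Because the listed versions (`0.2`, `0.1`) omit the PATCH component, the sketch pads missing components with zero.

```python
import re
from datetime import date

# Matches headings such as "## [0.2] - 2024-10-02".
# MINOR and PATCH are optional because the changelog writes "0.2", not "0.2.0".
HEADING = re.compile(r"^## \[(\d+(?:\.\d+){0,2})\] - (\d{4})-(\d{2})-(\d{2})$")


def parse_release(line):
    """Return ((major, minor, patch), release_date), or None for other lines."""
    match = HEADING.match(line.strip())
    if match is None:
        return None
    parts = [int(p) for p in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # pad missing components with zero
    released = date(int(match.group(2)), int(match.group(3)), int(match.group(4)))
    return tuple(parts), released
```

Comparing the returned tuples orders releases correctly, which plain string comparison would not do once a component reaches two digits.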
# Contributing to this project
Thank you for your interest in contributing to FACT! We welcome contributions from everyone and value your input and efforts to improve this project. Below, you will find guidelines and instructions on how to contribute effectively and ensure a smooth collaboration process.
## Ways to Contribute
There are many ways to contribute to our project:
- **Reporting bugs**: File issues for any bugs you encounter.
- **Feature suggestions**: Suggest new features or improvements to existing features.
- **Documentation**: Help improve the project documentation.
- **Code contributions**: Contribute bug fixes or implement new features.
## Submitting Contributions
Please follow these steps to submit your contributions:
1. **Create a branch** for your changes.
2. **Make your changes**: Follow our coding guidelines and write clean, testable code.
3. **Write tests**: Add or update tests as necessary for your changes.
4. **Run all tests** to ensure nothing else was accidentally broken.
5. **Submit a pull request**:
   - Push your branch to GitLab and open a pull request against the main branch.
   - Provide a clear description of the changes and any relevant issue numbers.
## Style Guide and Coding Conventions
- Use clear and descriptive variable names.
- Document all public functions and packages.
- Keep functions focused and well-organized within packages.
- Update the changelog accordingly.
## Review Process
Our team will review all pull requests as soon as possible. During the review, we may ask for additional changes or clarification. Pull requests must be approved by at least one maintainer before merging.
Contributions that include new features or substantial changes should be discussed in the issue tracker or discussions before starting work.
Thank you again for considering contributing to FACT. We look forward to your contributions and are excited to see what we can achieve together!
This diff is collapsed.
......@@ -3,14 +3,10 @@ ML detected building footprints and height open database. Bing Maps open buildin
---
## Use
Connect to the GitLab container registry with privileges for the FACT group:
>docker login -u FACT_token -p glpat-1zG-TQx___xLsPjj2vsG registry.git.nilu.no
Run this data service:
1. Pull the image:
> docker pull registry.git.nilu.no/fact/data/fact_bldgs:latest
2. Run this data service:
>docker run -t -i --name fact_bldgs -e MARIADB_DATABASE=FACT_bldgs -e MYSQL_ROOT_PASSWORD=devops -p 3315:3306 -d registry.git.nilu.no/fact/data/fact_bldgs:0.1
The container provides a MariaDB instance with the full building footprints and height database, **FACT_bldgs**. It is reachable on port 3315 of localhost, and the root password is 'devops'.
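Once the container is running, a client needs the host, port, user, password, and database name given above. The sketch below assembles them into a connection URL. The `mysql+pymysql://` scheme is an assumption (it is the form SQLAlchemy accepts when the PyMySQL driver is installed); neither library is mentioned in this repository, and `connection_url` is a hypothetical helper.

```python
from urllib.parse import quote


def connection_url(user="root", password="devops", host="127.0.0.1",
                   port=3315, database="FACT_bldgs"):
    """Build a database URL for the MariaDB instance exposed by the container.

    Defaults mirror the `docker run` command in this README. User and password
    are percent-encoded so that characters like '@' or '/' cannot break the URL.
    """
    return (
        f"mysql+pymysql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )
```

If SQLAlchemy and PyMySQL happen to be available, the resulting string can be passed to `sqlalchemy.create_engine`; other clients will need the same five values in their own format.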
......@@ -40,7 +36,14 @@ Not used here:
https://sites.research.google/open-buildings/
---
### Authors
## Author
Riccardo Boero - ribo@nilu.no
### License
## License
The data and software in this repository are licensed under the Open Data Commons Open Database License [(ODbL) v1.0](https://opendatacommons.org/licenses/odbl/1-0/).
## Citation
- Boero, Riccardo. 2024. “FACT Data: Buildings Footprint and Height.” OSF. [doi:10.17605/OSF.IO/CHDJR](https://doi.org/10.17605/OSF.IO/CHDJR).
Part of the Fine scAle eConomic daTa - FACT project:
- Boero, Riccardo. 2024. “Fine scAle eConomic daTa - FACT.” OSF. [doi:10.17605/OSF.IO/PV4ZW](https://doi.org/10.17605/OSF.IO/PV4ZW).
\ No newline at end of file
gen_data.sh 100755 → 100644
File mode changed from 100755 to 100644
import os
import pandas as pd
import geopandas as gpd
from shapely.geometry import shape
from concurrent.futures import ThreadPoolExecutor
# Define the list of locations you're interested in
interested_locations = ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'CzechRepublic', 'Denmark', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Netherlands', 'Poland', 'Portugal', 'Romania', 'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Iceland', 'Switzerland', 'Liechtenstein', 'Norway', 'UnitedKingdom', 'Ukraine', 'UnitedStates']
def process_file(row):
    file_path = f"data/{row['QuadKey']}.geojson"
    if row['Location'] in interested_locations and not os.path.exists(file_path):
        df = pd.read_json(row['Url'], lines=True)
        df['geometry'] = df['geometry'].apply(shape)
        gdf = gpd.GeoDataFrame(df, crs=4326)
        gdf.to_file(file_path, driver="GeoJSON")


def main():
    # Create the directory if it does not exist
    os.makedirs('../data', exist_ok=True)
    dataset_links = pd.read_csv("https://minedbuildings.blob.core.windows.net/global-buildings/dataset-links.csv")
    for _, row in dataset_links.iterrows():
        df = pd.read_json(row.Url, lines=True)
        df['geometry'] = df['geometry'].apply(shape)
        gdf = gpd.GeoDataFrame(df, crs=4326)
        gdf.to_file(f"../data/{row.QuadKey}.geojson", driver="GeoJSON")
    if not os.path.exists('data'):
        os.makedirs('data')
    # Use ThreadPoolExecutor to process files in parallel
    with ThreadPoolExecutor(max_workers=20) as executor:
        executor.map(process_file, dataset_links.to_dict(orient='records'))


if __name__ == "__main__":
    main()
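One property of the parallel step is worth illustrating: `Executor.map` returns a lazy iterator, and an exception raised inside a worker is only re-raised when its result is retrieved. Since the script never iterates over the results, a failed download would pass unnoticed. The following is a minimal sketch of one way to surface such failures, using `submit` and `as_completed` in place of `map`; `run_all` is a hypothetical helper, not part of the script.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_all(worker, rows, max_workers=20):
    """Run worker(row) for every row in parallel.

    Returns a list of (row, exception) pairs for the rows whose worker raised,
    so the caller can log or retry them.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its input row for error reporting.
        futures = {executor.submit(worker, row): row for row in rows}
        for future in as_completed(futures):
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as exc:
                failures.append((futures[future], exc))
    return failures
```

Under this sketch, the script's call would become `run_all(process_file, dataset_links.to_dict(orient='records'))`, and a non-empty return value signals tiles that need another attempt.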
File mode changed from 100755 to 100644
#!/bin/bash
# Database credentials (hard-coded here; loading them from the environment would be safer)
DB_HOST="plazablanca.nilu.no"
DB_USER="root"
DB_PASS="impadminPSWD!"
DB_NAME="FACT_bldgs"
# Path to the subdirectory
dir="data"
# Ensure the directory exists
if [ ! -d "$dir" ]; then
    echo "Directory not found: $dir"
    exit 1
fi
# Maximum number of parallel ogr2ogr instances
max_parallel=10
# Process the first file separately
first_file=true
# Array to hold the PIDs of the background processes
declare -a pids
# Total number of files
total_files=$(find "$dir" -type f | wc -l)
processed_files=0
remaining_files=$total_files
for file in "$dir"/*; do
    # Skip if not a file
    if [ ! -f "$file" ]; then
        continue
    fi
    if [ "$first_file" = true ]; then
        echo "Processing first file: $file"
        ogr2ogr -f MySQL MySQL:"$DB_NAME,host=$DB_HOST,user=$DB_USER,password=$DB_PASS" "$file" -nln footprints -update -overwrite -lco engine=Aria
        first_file=false
        ((processed_files++))
        ((remaining_files--))
        echo "Processed files: $processed_files, Remaining: $remaining_files"
    else
        # Start processing the file in the background
        echo "Processing file in parallel: $file"
        ogr2ogr -f MySQL MySQL:"$DB_NAME,host=$DB_HOST,user=$DB_USER,password=$DB_PASS" "$file" -nln footprints -update -append &
        pids+=($!)
        ((processed_files++))
        ((remaining_files--))
        echo "Processed files: $processed_files, Remaining: $remaining_files"
        # Check if we need to wait for any process to finish
        while [ ${#pids[@]} -ge $max_parallel ]; do
            # Check each process if it's still running
            for i in "${!pids[@]}"; do
                if ! kill -0 "${pids[$i]}" 2>/dev/null; then
                    # Process is finished, remove it from the array
                    unset 'pids[i]'
                fi
            done
            # Update the array to remove any gaps
            pids=("${pids[@]}")
            sleep 1 # A short delay to prevent the loop from consuming too much CPU
        done
    fi
done
# Wait for any remaining background processes to finish
wait
echo "All processing complete."
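The loading logic above (the first file creates the table, later files are appended with bounded parallelism) can also be sketched in Python, with credentials read from the environment instead of being written into the script. This is only an illustration under stated assumptions: the environment variable names `DB_NAME`, `DB_HOST`, `DB_USER`, and `DB_PASS` are borrowed from the shell variables and are not defined anywhere in the repository, and the function names are hypothetical. The `ogr2ogr` flags are copied from the script unchanged.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def connection_string(env=os.environ):
    """Build the ogr2ogr MySQL connection string from environment variables."""
    return "MySQL:{name},host={host},user={user},password={pw}".format(
        name=env["DB_NAME"], host=env["DB_HOST"],
        user=env["DB_USER"], pw=env["DB_PASS"],
    )


def build_commands(files, conn):
    """Return one ogr2ogr argument list per file.

    The first file (re)creates the `footprints` table; the rest append to it.
    """
    commands = []
    for index, path in enumerate(files):
        cmd = ["ogr2ogr", "-f", "MySQL", conn, str(path),
               "-nln", "footprints", "-update"]
        cmd += ["-overwrite", "-lco", "engine=Aria"] if index == 0 else ["-append"]
        commands.append(cmd)
    return commands


def load_all(directory="data", max_parallel=10):
    """Load every file in `directory`, at most `max_parallel` at a time."""
    files = sorted(p for p in Path(directory).iterdir() if p.is_file())
    commands = build_commands(files, connection_string())
    if not commands:
        return
    # The table must exist before any append starts, so run the first command alone.
    subprocess.run(commands[0], check=True)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # list() forces evaluation so a failing ogr2ogr call raises here.
        list(pool.map(lambda c: subprocess.run(c, check=True), commands[1:]))
```

Passing argument lists to `subprocess.run` avoids shell quoting problems with passwords that contain special characters, and `check=True` turns a non-zero `ogr2ogr` exit status into an exception rather than a silent failure.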