Wind Speed & Direction Modeling

Accurate Wind Speed & Direction Modeling requires a rigorous geospatial pipeline that bridges meteorological observations with high-resolution spatial grids. Unlike scalar environmental variables, wind is inherently vectorial and height-dependent, demanding explicit handling of coordinate reference systems (CRS), circular statistics, and vertical extrapolation. Within broader Solar & Wind Resource Modeling Workflows, this stage serves as the critical bridge between raw anemometer/LiDAR data and bankable energy yield assessments. The following pipeline details production-ready Python implementations for spatial interpolation, directional vectorization, hub-height scaling, and compliance-ready raster generation.

1. CRS Harmonization & Station Data Ingestion

Spatial interpolation accuracy degrades rapidly when distance calculations are performed in geographic coordinates (EPSG:4326). The first production step standardizes all meteorological inputs into a local projected CRS optimized for distance preservation, typically a UTM zone or state plane. Memory-efficient ingestion requires strict schema validation, coordinate filtering, and explicit projection before GeoDataFrame construction.

python
import geopandas as gpd
import pandas as pd
import numpy as np
import logging
from pathlib import Path
from pyproj import CRS

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

def prepare_station_gdf(csv_path: str | Path, target_epsg: int = 32612) -> gpd.GeoDataFrame:
    """
    Ingest wind station CSV, validate coordinates, and transform to projected CRS.
    Enforces memory-safe filtering prior to geometry construction.
    """
    path = Path(csv_path)
    if not path.exists():
        raise FileNotFoundError(f"Station data not found: {path}")

    df = pd.read_csv(path, dtype={"station_id": str})
    required_cols = {"station_id", "latitude", "longitude", "wind_speed_ms", "wind_dir_deg"}
    missing = required_cols - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    # Spatial validation: drop invalid coordinates
    valid_mask = (
        df["latitude"].between(-90, 90) &
        df["longitude"].between(-180, 180) &
        df["wind_speed_ms"].ge(0) &
        df["wind_dir_deg"].between(0, 360)
    )
    df = df.loc[valid_mask].copy()
    logging.info(f"Retained {len(df)} valid stations after spatial filtering.")

    gdf = gpd.GeoDataFrame(
        df,
        geometry=gpd.points_from_xy(df.longitude, df.latitude),
        crs=CRS.from_epsg(4326)
    )
    gdf_proj = gdf.to_crs(epsg=target_epsg)
    logging.info(f"Transformed to EPSG:{target_epsg} | Bounds: {gdf_proj.total_bounds}")
    return gdf_proj

2. Vectorized U/V Component Interpolation

Direct interpolation of wind direction (0°–360°) produces mathematical artifacts at the 0/360 boundary. The industry-standard approach decomposes direction into orthogonal U (east-west) and V (north-south) components, interpolates each independently, and reconstructs the vector field. For large domains, chunked grid generation prevents memory exhaustion during scipy interpolation routines (see official SciPy interpolation documentation).

python
from scipy.interpolate import griddata
from typing import Tuple

def decompose_and_interpolate(
    gdf: gpd.GeoDataFrame,
    grid_res_m: float = 100.0,
    method: str = "linear"
) -> Tuple[np.ndarray, np.ndarray, dict]:
    """
    Decompose wind vectors, interpolate U/V components, and return grid arrays + metadata.
    """
    coords = np.column_stack((gdf.geometry.x, gdf.geometry.y))
    speeds = gdf["wind_speed_ms"].values
    dirs_rad = np.deg2rad(gdf["wind_dir_deg"].values)

    # Vector decomposition (meteorological convention: wind FROM direction)
    u = -speeds * np.sin(dirs_rad)
    v = -speeds * np.cos(dirs_rad)

    # Define regular grid bounds
    minx, miny, maxx, maxy = gdf.total_bounds
    cols = int(np.ceil((maxx - minx) / grid_res_m))
    rows = int(np.ceil((maxy - miny) / grid_res_m))

    xi = np.linspace(minx, maxx, cols)
    yi = np.linspace(miny, maxy, rows)
    grid_x, grid_y = np.meshgrid(xi, yi)

    # Chunked interpolation to manage RAM footprint
    chunk_size = 10000
    u_grid = np.full_like(grid_x, np.nan)
    v_grid = np.full_like(grid_x, np.nan)

    flat_coords = np.column_stack((grid_x.ravel(), grid_y.ravel()))
    for i in range(0, len(flat_coords), chunk_size):
        chunk = flat_coords[i:i+chunk_size]
        u_grid.ravel()[i:i+chunk_size] = griddata(coords, u, chunk, method=method)
        v_grid.ravel()[i:i+chunk_size] = griddata(coords, v, chunk, method=method)

    metadata = {
        "transform": rasterio.transform.from_origin(minx, maxy, grid_res_m, grid_res_m),
        "crs": gdf.crs,
        "shape": (rows, cols),
        "bounds": (minx, miny, maxx, maxy)
    }
    return u_grid, v_grid, metadata

3. Hub-Height Extrapolation & Wind Shear Application

Turbine hub heights typically exceed measurement mast elevations, necessitating vertical extrapolation via the power law or logarithmic wind profile. The exponent (shear coefficient) varies significantly with terrain roughness and atmospheric stability. For detailed methodologies on deriving site-specific alpha values, refer to Calculating wind shear coefficients with Python. Once computed, the shear model is applied uniformly across the raster grid using vectorized broadcasting.

python
def apply_hub_height_scaling(
    u_grid: np.ndarray,
    v_grid: np.ndarray,
    alpha: float = 0.143,
    meas_height_m: float = 50.0,
    hub_height_m: float = 100.0
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Apply power-law wind shear to U/V grids. Handles NaN propagation safely.
    """
    if alpha < 0 or alpha > 0.5:
        logging.warning("Shear coefficient outside typical range [0.0, 0.5]. Verify site conditions.")

    ratio = (hub_height_m / meas_height_m) ** alpha
    # Mask NaNs to prevent invalid scaling
    u_scaled = np.where(np.isnan(u_grid), np.nan, u_grid * ratio)
    v_scaled = np.where(np.isnan(v_grid), np.nan, v_grid * ratio)
    return u_scaled, v_scaled

4. Async Rasterization & Spatial Validation

Raw interpolated fields often require correction for local topographic acceleration and terrain-induced flow distortion. Integrating elevation derivatives and slope/aspect masks aligns with established Terrain & Shadow Analysis Pipelines to capture microscale wind channeling effects. To maintain throughput during rasterization, asynchronous I/O decouples computation from disk writes, preventing thread blocking during large GeoTIFF serialization.

python
import rasterio
from rasterio.windows import Window
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def write_raster_async(
    u_grid: np.ndarray,
    v_grid: np.ndarray,
    metadata: dict,
    output_path: str | Path,
    block_size: int = 512
) -> None:
    """
    Write U/V bands to a single GeoTIFF using async chunked I/O.
    Validates spatial bounds and CRS alignment prior to write.
    """
    out_path = Path(output_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)

    profile = {
        "driver": "GTiff",
        "dtype": "float32",
        "count": 2,
        "width": metadata["shape"][1],
        "height": metadata["shape"][0],
        "crs": metadata["crs"],
        "transform": metadata["transform"],
        "compress": "deflate",
        "nodata": np.nan,
        "blockxsize": block_size,
        "blockysize": block_size,
        "tiled": True
    }

    # Spatial validation before I/O
    if not np.isfinite(u_grid).any() or not np.isfinite(v_grid).any():
        raise ValueError("Grid contains no valid interpolated data. Check station density.")

    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as executor:
        with rasterio.open(out_path, "w", **profile) as dst:
            tasks = []
            for row in range(0, profile["height"], block_size):
                for col in range(0, profile["width"], block_size):
                    window = Window(col, row,
                                    min(block_size, profile["width"] - col),
                                    min(block_size, profile["height"] - row))
                    u_block = u_grid[window.row_off:window.row_off+window.height,
                                     window.col_off:window.col_off+window.width]
                    v_block = v_grid[window.row_off:window.row_off+window.height,
                                     window.col_off:window.col_off+window.width]

                    # Schedule async write
                    tasks.append(loop.run_in_executor(
                        executor, dst.write, np.stack([u_block, v_block]), range(1, 3), window=window
                    ))
            await asyncio.gather(*tasks)

    logging.info(f"Async rasterization complete: {out_path}")

5. Compliance Packaging & Quality Assurance

Final outputs must adhere to CF-Conventions and OGC GeoTIFF standards for downstream compatibility. Validation routines verify CRS alignment, check for NaN propagation, and ensure metadata completeness. These practices mirror the rigorous quality gates applied in Solar Irradiance Raster Processing, ensuring cross-technology interoperability within hybrid renewable portfolios. A final compliance check should verify:

  • Band descriptions (wind_speed_u_ms, wind_speed_v_ms, hub_height_m)
  • Projection consistency with project boundary shapefiles
  • Statistical sanity checks (e.g., max wind speed < 95th percentile of regional climatology)

Production deployments should wrap the pipeline in a configuration-driven orchestrator (e.g., Prefect or Airflow) to handle temporal aggregation, re-try logic for failed chunks, and automated metadata cataloging.

Conclusion

Wind speed and direction modeling demands explicit vector decomposition, memory-aware interpolation, and height-dependent scaling to meet utility-grade accuracy thresholds. By enforcing strict CRS harmonization, leveraging async chunked I/O, and embedding spatial validation at each pipeline stage, engineering teams can generate reproducible, bankable wind resource grids. Integrating these outputs with terrain-aware microclimate corrections and standardized raster packaging ensures seamless interoperability across energy yield modeling stacks and regulatory compliance frameworks.