RINEX Reader

Github Repo

This was my first major project with Python. Most of this post will be taken from the assignment submission. The project was part of my GPS course in undergrad; the project involved creating a Python program to get an epoch by epoch least squares position solution of a GPS receiver using its observation and navigation RINEX files. This project was done in part with Patrick Lasagne.

Objectives

Develop Python functions to process RINEX observations and navigation files and store the data in data structures.
Develop Python functions to determine satellite position using the navigation files.
Develop Python functions to compute and apply ionospheric and tropospheric corrections, as well as satellite offsets, and apply them to observation pseudoranges where applicable.
Run an epoch by epoch least squares adjustment to determine the location and clock offset of the receiver.
Compute various DOP values using the covariance matrix from the least squares adjustment.
Determine the accuracy and precision of the solution.
Explain why the values were acquired.

Software Structure

RINEX stands for Receiver Independent Exchange Format, it is stored in an ASCII format and contains GNSS Observations. The two RINEX files that the software reads are the observation and navigation files.

It was decided that the information stored in the observation and navigation files would be stored in a Python dictionary data structure. A dictionary holds a set of unique "key-value" pairs that allow information to be accessed rapidly, without having to deal with messy array indices. The only downside of dictionaries is that there is no reliable way to iterate through a dictionary in a certain order; to counteract this, each dictionary that may need to be iterated over also includes a list of keys corresponding to the key 'LIST' in the order they were placed into the dictionary.

The parsing of the RINEX navigation file is done line by line and character by character. Each element in the RINEX file is given a certain amount of spacing to fill up. This information on the spacing is found in the GAGE GROUP's gLab RINEX file format description.

Observation File Reader

The observation RINEX file contains all of the GPS receiver's observations that it has collected over its setup time. The file is split into two components: the header, which contains important information about the receiver itself, and the data section, which contains phase and pseudorange observations to various satellites at various epochs.

Header Section

The header section contains information relating to the receiver, such as its position in a given coordinate system, its antenna height, or its make and model. It also contains clerical information pertaining to the observation set, and may include extra information present in comments. Most importantly, the header details the type of observations present in the RINEX file.

The header contains information for the first 60 characters (there are 80 on a line), the remaining 20 constitute a label which explains what is detailed in the previous 60 characters, each line is then terminated with a new line break (\n). To fully parse the header, the RINEX file is read in line by line. At each line, the label is checked to see what kind of header information is present on that line, this continues until END OF HEADER is reached. While the software attempts to read the header, it consults the last word of the label, and attempts to match it with a predefined list (with HTYPER method). In the event that two or more label types have the same last word, then the second last word is used instead. Once the label is identified, the software parses the line according to it (with ASSIGNDIC). In the event # / TYPES OF OBSERV occurs on more than two lines, the list that holds the observation types has the extra values appended to it. Only a few important keys present in the header are used:

Key	Value	Description
`'POS'`	Dictionary with sub keys 'X', 'Y', and 'Z'	Contains the X, Y, and Z position of the receiver.
`'OBSTYP'`	Dictionary with sub keys 'NUM' and 'OBS'	'NUM' contains the integer number of satellites; 'OBS' contains a list of the two alphanumeric GPS observation types (e.g. 'C1', 'P1')

Data section

The data section contains the epoch by epoch observations by the receiver to the satellite. The reader reads the first line, which has the epoch, the total number of satellites and the list of the satellites. The epoch is made into a key in the form 'YY:MM:DD:HH:MM:S.sssssss', it maps to a dictionary that will be used to include all satellite observations. The reader then uses the number of satellites present and the number of observations expected to determine how many lines remain in the observation block.

Key	Value	Description
`'YY:MM:DD:HH:MM:S.sssssss'`	Dictionary with sub keys 'PRN', 'LIST', 'NUMSAT'	The key refers to the observation block epoch. 'LIST' maps to a list of all satellite PRN numbers (for iterating); 'NUMSAT' refers to the integer number of satellites.
`'PRN'`	Dictionary with sub keys 'C1', 'P1', 'P2', etc.	Contains the observations for the given satellite at the given epoch.

Dictionary Structure

{
    '07:1:1:0:0:0.0000000':{
        'G06':{
            'C1':20857582.237,
            'P1':20857582.016,
            'P2':20857580.516,
            },
        'G07':{
            'C1':20418227.176,
            'P1':20418226.502,
            'P2':20418225.415
            },
        'G10':{
            'C1':23445664.779,
            'P1':23445663.158,
            'P2':23445663.052,
        },
        'LIST':['G06','G10','G07'],
        'NUMSAT':10
    },
    '07:1:1:0:0:30.0000000':{
        'G06':{
            'C1':20868360.224
            ...
        }
    }
}

Navigation File Reader

Header Section

The header file contains metadata information about the RINEX file such as the RINEX version. The line identifier in the header has reserved space of 20 characters at the end of the line. Each RINEX file has 80 characters per line therefore the location of the header labels are always known. For each header label the location of the variables is defined by the index of the character spacing. Besides the RINEX version the header contains comments about the RINEX file, and 3 optional pieces of information such as the ephemeris coefficients of the Ionosphere alpha and beta. Also optionally included in the header is the clock offset coefficients of UTC from GPS and the number of leap seconds the current GPS time is from UTC. The end of the header is defined by the label 'END OF HEADER'.

Data Section

The navigation data is sorted in the RINEX file by the GPS PRN number then epoch the ephemeris data is valid for. Each element of the line has a certain index and spacing; the data is parsed until the 8th line is reached and a new PRN number is parsed. Each broadcast contains 29 variables that the satellite broadcasts. These 29 variables are given in a block of 8 lines and 4 columns; one of the spaces is for the PRN number and the epoch while the other 2 spaces of the 32 spaces are left empty in the RINEX 2.11 version.

The data is collectively stored with a nested dictionary, where the top most key is the GPS PRN number, then the data blocks are divided by subkeys of the epoch that the broadcast data is valid for.

Key	Value	Description
`'PRN'`	Dictionary with sub keys epoch	Contains the 29 navigation parameters and broadcasted navigation information by the satellite.

Navigation Dictionary Structure

{
    'G01':{'10:1:1:16:0:0.0':{
        'Cic':5.587935447693e-09,
        'Cis':7.636845111847e-08,
        'Crc':183.21875,
        'Crs':-165.46875,
        'Cuc':-8.651986718178e-06,
        'Cus':1.038610935211e-05,
        'Deln':5.447369761959e-09,
        'Ecc':0.003840734250844,
        'Fit_int':4.0,
        'GPS_W':1564.0,
        'IDOT':-7.000291590243e-11,
        'IODC':51.0,
        'IODE':51.0,
        'Io':0.9635689853388,
        'L2_CC':1.0,
        'L2_P':0.0,
        'Mo':0.0841761497488,
        'OMEGA':-0.1005044703918,
        'OMEGA_DOT':-7.951402636407e-09,
        'Omega':0.9428572252394,
        'SV_Acc':2.0,
        'SV_CLB':-7.538078352809e-05,
        'SV_CLD':-3.069544618484e-12,
        'SV_CLR':0.0,
        'SV_Health':63.0,
        'SqrtA':5153.668302536,
        'TGD':-1.909211277962e-08,
        'TOE':489600.0,
        'Trans_time':487788.0
    }}
}

Satellite Position

To compute the receiver coordinates, first the coordinates of the satellite must be determined. The Keplerian elements of the satellite are broadcasted in the Navigation file. Each satellite in the GPS constellation broadcasts its position prediction in orbit for each two hours. Once all 29 parameters in the RINEX navigation file are read, the process of determining the satellite position occurs in the module satpos.py.

The process of determining satellite positions is outlined in detail in the ICD document. The algorithm is defined in section 20.3.3.4.3 User Algorithm for Ephemeris Determination Pages 103–105, Table 20-IV. The eccentric anomaly is solved using an iterative process until the difference between the current value and the previous value is smaller than a threshold of 0.000000000001. The final output is the ECEF coordinates of the satellite.

Corrections

Ionospheric Correction

When using a dual frequency receiver, the delay due to the ionosphere can be nearly mitigated with a linear combination of the two pseudorange equations. The ionosphere-free linear combination is:

$$PR=\frac{PR_2-\gamma PR_1}{1-\gamma}$$

PR is the iono free corrected pseudorange, $PR_1$ and $PR_2$ are the pseudorange observations on two frequencies, and $\gamma$ is the squared ratio between the two carrier wave frequencies:

$$\gamma =\left(\frac{f_1}{f_2}\right)^2 = \left(\frac{1575.42}{1227.6}\right)^2 = \left(\frac{77}{60}\right)^2$$

Tropospheric correction

When the GPS signal travels through the atmosphere it is refracted by the medium, which delays the signal. This delay can be modeled using the Saastamoinen Closed Form model:

$$dtrop=\frac{0.002277}{\cos(z)}\left[P_0 + \left(\frac{1255}{T} + 0.05\right)e_0 - \tan^2(z)\right]$$

Where dtrop is the tropospheric correction in meters, z is the zenith angle, P_0 is the atmospheric pressure in mbar, T is the temperature in Celsius, and e_0 is the water vapour pressure in mbar. This element of the software did not function properly and was omitted from the final solution.

Least Squares Filter

The receiver coordinates and its clock offset must be estimated using an epoch by epoch Least Squares solution. The model equation is the pseudorange formula:

$$PR=\sqrt{(X_s - X_r)^2 + (Y_s - Y_r)^2 + (Z_s - Z_r)^2} + c(t_r - t_s)$$

PR is the pseudorange between the satellite and the receiver. The square root term is the geometric range from satellite to receiver, where terms with subscript "s" are satellite coordinates and subscript "r" are receiver coordinates. c is the speed of light; $t_r$ is the receiver time offset; $t_s$ is the satellite vehicle offset.

The observables form a vector l:

$$l = [PR_1, PR_2, PR_3 \ldots PR_n]$$

The unknowns are the receiver position and clock offset:

$$x = [X_r, Y_r, Z_r, t_r]$$

The parametric adjustment model:

$$l = f(x, c)$$

Because the pseudorange equation is non-linear, it must be expanded using Taylor Series Expansion using only the first two terms:

$$f(x) \approx f(x_0) + \frac{1}{1!}\left. \frac{df}{dx}\right|_{x=x_0} ( x - x_0)$$

The initial estimates were chosen to be the listed X, Y, and Z position of the station in the header, with a time offset of 0. The first design matrix (A) is composed of partial derivatives of each pseudorange function with respect to each unknown:

The partial derivatives of the pseudorange equation with respect to each unknown:

Partial derivatives of pseudorange equation

The misclosure vector w:

$$w = l - f(x_0)$$

Delta (the correction to the estimates):

$$\delta = (A^TA)^{-1}A^Tw$$

Each iteration, delta is added to $x_0$, then w and A are recomputed. After a solution for an epoch is found, the covariance matrix of the unknowns is:

$$C_x = (A^TA)^{-1}$$ Covariance matrix

The variances on the diagonal are used to compute the DOPs of the epoch. The adjustment process is completed for each epoch; initial estimations from the previous epoch are used as starting estimates for the next.

Analysis Terms

Standard Deviation

The standard deviation is a statistical measure of accuracy. One standard deviation defines a 68% confidence interval. It is the square root of the diagonal elements of the covariance matrix of the unknowns (X, Y, Z, and clock offset).

Dilution of Precision

The Dilution of Precision (DOP) is a measure of solution precision. The two types used in this report are Geometric DOP (GDOP) and Position DOP (PDOP):

$$GDOP = \sqrt{\sigma_x^2 + \sigma_y^2 + \sigma_z^2 + \sigma_t^2 c^2}$$ $$PDOP = \sqrt{\sigma_x^2 + \sigma_y^2 + \sigma_z^2}$$

Analysis

Test Site ALBH

The RINEX dataset were provided by the Teaching Assistant. The station used for the positioning solution was designated ALBH, located near Victoria, British Columbia. ALBH is a continuously tracking GNSS site, part of the Western Canada Deformation Array (WCDA) and the Canadian Active Control System (CACS). The ECEF coordinates published by Natural Resources Canada's CACS website:

X = -2341333.003 ± 0.0007
Y = -3539049.514 ± 0.0008
Z =  4745791.300 ± 0.0009

Satellite Positions

The RINEX observations are recorded every 30 seconds, so there are approximately 2600 epochs of measurements. The satellite positions were visualized to verify correctness — the expectation is a path similar to the sun rise and fall motion.

Positions

The final solution of the receiver coordinates can be plotted to visualize the quality of the solution. Below is a scatter 2D plot of the receiver X and Y coordinates.

Scatter plot of receiver X and Y coordinates

Looking at the above figure there seems to be an irregularity with the position solutions. The expected accuracy of the Standard Position Solution is meter level; the final solutions do not satisfy this. There are several positions several hundreds of meters away from the centered cluster. One interesting characteristic is the streaks in the positions, hinting towards a correlation with time — there is likely a blunder in the programming obtaining an incorrect satellite position or time. Due to time constraints the debugging was not completed.

Coordinate difference from published coordinates

As one can notice from the above figure, the initial solutions have a stable deviation of about meter level for X and Z, 10s of meters for the Y component for about the first 1000 epochs, then the coordinates deviate to 100s of meters past the 1100th epoch.

The solutions were averaged to obtain a general estimate of the position of ALBH:

Component	Value	Standard Deviation
X [m]	-2341315.307	15.103
Y [m]	-3539059.522	34.744
Z [m]	4745791.669	31.694
t [s]	-3.39126190338e-08	3.06918042012e-07

Difference between the solutions and the published coordinates:

Component	Coordinate difference [m]
X	17.695
Y	-10.008
Z	0.369

DOP Values

The DOP of the final solution can be visualized to see if there is any correlation between the epoch when the discrepancy began.

There is a noticeable spike in the GDOP around the 500th epoch, though this does not correlate with the coordinate difference discrepancy. The average GDOP for the timeline of the observation file was 2.76, which is quite good.

The figure above illustrates the standard deviation of each element in the solution. The huge spike in the GDOP plot is mostly contributed by the Y and Z components. Due to time constraints the problem could not be pinpointed and solved.

Conclusion

This report involves all the software components needed to compute a Standard Positioning Service GPS navigation solution. The software was tested for a test case of observation data; although a position was outputted, the quality of the solution was not acceptable. The software bugs that caused the deviation could not be determined over the timeline of this project due to time constraints. If there was more time, the tropospheric delay of the GPS signal could be implemented and tested, as well as the Klobuchar model for determining the ionospheric delay for a single frequency receiver.