9/22/2010
IntelliDrive is a concept that leverages technology to promote connectivity among vehicles, roadway infrastructure, and wireless devices.
Connectivity will create a data-rich environment, providing substantial opportunities to make surface transportation safer, smarter, and more environmentally friendly.
The mobility program area includes two components: Real-Time Data Capture and Management, and Dynamic Mobility Applications.
Data from multiple sources and of multiple types (such as location data, transit data, weather data, vehicle status data, and infrastructure data) are captured, cleaned, and integrated into a data environment.  The data are then used by multiple applications:
- enhanced weather applications (e.g., real-time weather advisory or warning systems)
- real-time transit signal priority
- real-time traveler information systems
- environmental applications, such as eco-drive
- safety alerts and queue warning systems
Real-Time Data Capture and Management addresses the capture, cleaning, and integration of data in real time.  Dynamic Mobility Applications addresses the use of those data in real time to develop and deploy enhanced or transformative mobility applications.
The materials in the next several slides were drawn from the vision documents.  We need your feedback on the approach, especially where it departs from the norm.  Are there too many risks?  In particular, we are looking for feedback on the approaches to data capture and management, data distribution, and other concepts that are relatively new for a federally funded research program.
3
What is a data environment and how is that any different from what we have done in the past?
[Figure]:  Data are pulled into the environment (the orange globe).  Data users can extract information from it to build or support applications.  The data environment can contain multi-source data, which may be observed, simulated, or interpolated, along with all the elements needed to develop an application.
In some cases, the observed data might be flagged as being erroneous.  In other cases, high quality observed data might be interpolated to help with the assessment of applications if there isn’t sufficient market penetration of certain technologies.
4
Next we will look at the elements of data capture and management.  The orange globe represents the data environment, consisting of high-quality data.
The value of high-quality data is limited if it is not well documented, if there is no supporting metadata, or if data users have to spend time and resources figuring out the content and structure of the data environment.  So a key element is the provision of a well-documented data environment.
Next we need a mechanism to access the data, to allow data users to collaborate with each other, and to get their questions answered.  The intent here is not to build a giant, federal database.  We will try to identify the most logical way of supplying data that meets specific needs.  One approach is to have individual streams of data maintained by those who capture the data, and then use virtual warehousing techniques to combine data from multiple locations on the fly to serve particular applications or researchers.
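One way to picture the virtual-warehousing idea is as an on-the-fly merge of independently maintained, time-ordered streams, with no central copy. The sketch below is purely illustrative; the stream names and record layouts are invented for the example and are not part of any actual program design:

```python
import heapq

# Hypothetical per-source streams: each is an iterator of (timestamp, record)
# tuples maintained by the organization that captured the data.
def vehicle_stream():
    yield from [(1, {"src": "vehicle", "speed_mph": 31}),
                (4, {"src": "vehicle", "speed_mph": 28})]

def weather_stream():
    yield from [(2, {"src": "weather", "temp_f": 54}),
                (3, {"src": "weather", "precip": "rain"})]

def virtual_warehouse(*streams):
    """Merge independently maintained streams into one time-ordered feed
    on the fly, without copying them into a central database."""
    yield from heapq.merge(*streams, key=lambda item: item[0])

merged = list(virtual_warehouse(vehicle_stream(), weather_stream()))
print([t for t, _ in merged])  # timestamps arrive in order: [1, 2, 3, 4]
```

The point of the design choice is that each data capturer keeps custody of its own stream; only the merge happens at query time, for the particular application or researcher that needs it.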
Next we need the history or context of data capture.   Why was the data collected?
Finally, like all good data environments we need to have a governing structure, the rules of engagement.  Who can put data in?  Who can use it?  What are the rights and responsibilities of the data contributors and the data users?  The rules might change from one data environment to another. 
5
7
The first example is a portion of an OBE log file, showing timestamps, OBE ID (B420 in this case) and the start of the data fields.  The first 11 lines are followed by the start of a snapshot appearing in the log file at the time it is transmitted. Note that the Data Environment does not include all  OBE log files; it includes all log files for one day chosen for each of the three data sets.
The second example is a portion of a trajectory file, created by Noblis by extracting the second-by-second position records from the log file, and adding X-Y values corresponding to the given latitude/longitude values.  The data environment contains trajectory files for all vehicles on all days.
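As a rough illustration of how X-Y values can be derived from latitude/longitude, the sketch below projects each point to local meters relative to an origin using a flat-earth (equirectangular) approximation, which is adequate over a testbed-sized area. The actual procedure Noblis used may differ:

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def latlon_to_xy(lat, lon, origin_lat, origin_lon):
    """Convert a latitude/longitude pair to local X-Y meters relative to an
    origin point, using an equirectangular approximation: longitude offsets
    are scaled by cos(origin latitude)."""
    x = math.radians(lon - origin_lon) * EARTH_RADIUS_M * math.cos(math.radians(origin_lat))
    y = math.radians(lat - origin_lat) * EARTH_RADIUS_M
    return x, y

# A point 0.01 degrees north of the origin is roughly 1,112 m away in Y.
x, y = latlon_to_xy(42.01, -83.0, origin_lat=42.0, origin_lon=-83.0)
```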
The sample map was created by (1) running the Trajectory Plotter program, reading part of a trajectory file to create a KML file, and (2) inputting the KML file to Google Earth.  Within Google Earth there is pan and zoom capability to examine trajectories in detail.
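The KML step can be sketched as follows. This is a minimal illustration of the kind of file such a tool emits for Google Earth, not the Trajectory Plotter's actual output format:

```python
def trajectory_to_kml(points, name="trajectory"):
    """Write a trajectory (a list of (lat, lon) tuples) as a KML LineString
    that Google Earth can display.  Note that KML coordinates are ordered
    longitude,latitude,altitude."""
    coords = " ".join(f"{lon},{lat},0" for lat, lon in points)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        f'  <Placemark><name>{name}</name>\n'
        f'    <LineString><coordinates>{coords}</coordinates></LineString>\n'
        '  </Placemark>\n'
        '</kml>'
    )

kml = trajectory_to_kml([(42.0, -83.0), (42.001, -83.001)])
```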
The data environment also contains files with all snapshot data extracted from the OBE log files (including duplicates and snapshots not received by RSEs) and files listing all snapshot generation, RSE acquisition and transmission, and Probe Segment Number (PSN) change events.
Documentation for these files is included in the data environment.
8
The first example is a portion of an XML file containing one message received by an RSE from a vehicle.  The message may contain from one to four snapshots.  It is in XML format conforming to the J2735 DSRC standard.  The data environment contains one XML file for each message received.
The second example is a spreadsheet containing all the information in the snapshots received by RSEs, but in table form, usable in a spreadsheet or a database.  The SS# column is the snapshot number within the message; SS 0 is the information in the message header.  Noblis has added some quality-control flags to indicate variables with questionable or erroneous values.  There is one such file for each day in the data environment, containing all snapshots received by all RSEs on that day.
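A minimal sketch of how quality-control flags might be attached to a snapshot row is shown below. The field names and thresholds are invented for illustration and are not the actual Noblis rules:

```python
def flag_snapshot(row):
    """Attach simple quality-control flags to one snapshot record (a dict).
    Range checks catch impossible coordinates; a speed cap catches values
    that are implausible for a probe vehicle."""
    flags = []
    if not (-90.0 <= row.get("lat", 0.0) <= 90.0):
        flags.append("BAD_LAT")
    if not (-180.0 <= row.get("lon", 0.0) <= 180.0):
        flags.append("BAD_LON")
    if row.get("speed_mph", 0.0) > 120.0:
        flags.append("QUESTIONABLE_SPEED")
    row["qc_flags"] = ";".join(flags) or "OK"
    return row

# A latitude of 95 degrees is impossible, so the row gets flagged.
bad = flag_snapshot({"lat": 95.0, "lon": -83.0, "speed_mph": 30.0})
```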
Documentation for these files is included in the data environment.
9
The TCA program is an open-source program written in Python.  It can be configured to model a variety of snapshot generation/transmission policies and PSN change policies.  Given a set of vehicle trajectories and RSE locations, the TCA produces the set of snapshots that would have been generated, transmitted, and received by RSEs under the given policies.  The program also reads parameters defining the buffer management policy, and models the deletion of snapshots when the buffer size is exceeded.
With different input parameters, the TCA can model direct transmission of snapshots to a central location via cellular communication rather than to individual RSEs.  The TCA does not model transmission failures or losses.
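The buffer-deletion behavior can be sketched as a bounded queue. First-in-first-out deletion is assumed here purely for illustration; the real TCA reads its buffer management policy from input parameters:

```python
from collections import deque

class SnapshotBuffer:
    """Toy model of an on-board snapshot buffer: snapshots accumulate until
    a transmission opportunity, and when capacity is exceeded the oldest
    snapshot is deleted (FIFO policy assumed for this sketch)."""

    def __init__(self, capacity):
        self.buf = deque()
        self.capacity = capacity
        self.deleted = 0  # count of snapshots lost to buffer overflow

    def add(self, snapshot):
        self.buf.append(snapshot)
        if len(self.buf) > self.capacity:
            self.buf.popleft()   # drop the oldest snapshot
            self.deleted += 1

    def transmit(self):
        """Return and clear all buffered snapshots, as when passing an RSE."""
        out = list(self.buf)
        self.buf.clear()
        return out

# With capacity 3, adding 5 snapshots deletes the 2 oldest.
b = SnapshotBuffer(capacity=3)
for s in range(1, 6):
    b.add(s)
```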
TCA documentation and sample input files are included in the data environment.
10
UMTRI has created a Paramics model of the testbed area and has run a simulation of the morning rush hour from 6 am to 11 am.  The model can handle up to 10,000 vehicles in the network at one time.  UMTRI has validated the model scenario with traffic counts obtained from SEMCOG.  In addition to modeling the traffic, UMTRI has added routines to model PSN changes, snapshot generation, and interaction with RSEs.  The result is a set of files containing snapshot information received by RSEs.  These files are far larger than the RSE files for the POC and NCAR trials, because every one of the thousands of vehicles is a probe vehicle.
The data environment contains a set of files with all the snapshots received by each RSE from 6 am to 11 am, and another set of files with snapshots received during the peak hour from 8 am to 9 am.
The sample data file is a portion of one of the RSE snapshot files.
The map shows the street network present in the Paramics simulation.  The red dots show the location of the RSEs.
11
This screen shot illustrates the data files available on the Data Capture & Management Portal.  The panel on the left shows the data sets that are available, and the other features, including projects, forums, members, and website links.
The right panel shows an overview of the POC trials and the start of the list of files related to the POC trials.  The list of files can be narrowed by selecting from the “File Type” and “IntelliDrive Type” selection boxes.
12