Who will take care of Data Privacy on Autonomous Vehicles?

Implications of data privacy once autonomous vehicles will be commercially available.

15 January 2021, by Mario Sabatino Riontino

Figure 1: Car about to stop at pedestrian crossing. Photo by Wesley Armstrong on Unsplash

Autonomous Vehicles (AV) at a glance

There is a great deal of talk about the advent of autonomous vehicles, which promise to disrupt the way we consume mobility. As the development and testing of self-driving car technology have progressed, the prospect of privately owned self-driving cars operating on public roads is getting closer. Navigant Research forecast that 94.7 million vehicles with self-driving capabilities will be sold annually by 2035.

Autonomous vehicles (AVs) use technology to partially or entirely replace the human driver in operating a machine from point A to point B while responding to traffic conditions.

According to McKinsey, while AV technology presents revolutionary change, its adoption will be evolutionary:

“We expect Level 4 autonomy—operating within virtual geographic boundaries—to be disruptive and available between 2020 and 2022, with full adoption coming later. Full autonomy with Level 5 technology—operating anytime, anywhere—is projected to arrive by 2030 at the earliest, with greater adoption by that time.“

Levels of Driving Automation

Level | Name | Description
0 | No automation | The human performs all the driving tasks such as steering, acceleration, braking, etc.
1 | Driver assistance | The vehicle includes a single automated system (e.g. cruise control).
2 | Partial automation | The vehicle can steer and brake autonomously. However, the human needs to monitor and take control at any time.
3 | Conditional automation | The vehicle can perform any driving tasks, but human monitoring is still required.
4 | High automation | The vehicle can perform any driving tasks, but geofencing is required. Human overriding is still an option in case of emergency.
5 | Full automation | The vehicle performs all kinds of tasks under all conditions. No human interaction is required.
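
As a rough illustration, the levels in the table can be encoded so that software can reason about them. The sketch below is only one possible encoding: the names `AutomationLevel`, `needs_human_monitoring` and `is_geofenced` are assumptions made for this example, and the two rules simply restate the table's descriptions.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    """Driving automation levels 0-5, named as in the table."""
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5


def needs_human_monitoring(level: AutomationLevel) -> bool:
    # Per the table, levels 0-3 still rely on a human driving or monitoring.
    return level <= AutomationLevel.CONDITIONAL_AUTOMATION


def is_geofenced(level: AutomationLevel) -> bool:
    # Per the table, only Level 4 is described as requiring geofencing.
    return level == AutomationLevel.HIGH_AUTOMATION
```

Because `IntEnum` members compare like integers, a check such as `level >= AutomationLevel.HIGH_AUTOMATION` can be used wherever a feature should only apply to highly or fully automated vehicles.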

No discussion of AVs is complete without mentioning the flood of data they generate. A study by Intel, discussed by Intel CEO Brian Krzanich at Automobility LA, suggests that a single autonomous vehicle will generate around 4,000 GB (4 terabytes) of data every day. It is also estimated that, in the foreseeable future, autonomous vehicles will generate over 300 TB of data each year in the US alone.
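
To get a feeling for the Intel figure, the following back-of-the-envelope sketch scales the daily estimate to a year and to a small fleet. The function names and the fleet size of 1,000 vehicles are assumptions chosen for illustration; only the 4,000 GB per day comes from the text.

```python
GB_PER_TB = 1_000             # decimal units, matching "4,000 GB (4 terabytes)"
DAILY_GB_PER_VEHICLE = 4_000  # Intel estimate quoted in the text


def yearly_tb_per_vehicle(daily_gb: float = DAILY_GB_PER_VEHICLE, days: int = 365) -> float:
    """Data generated by one vehicle over `days` days, in terabytes."""
    return daily_gb * days / GB_PER_TB


def fleet_daily_tb(vehicles: int, daily_gb: float = DAILY_GB_PER_VEHICLE) -> float:
    """Data generated by a fleet in a single day, in terabytes."""
    return vehicles * daily_gb / GB_PER_TB


print(yearly_tb_per_vehicle())  # 1460.0
print(fleet_daily_tb(1_000))    # 4000.0 (hypothetical fleet of 1,000 vehicles)
```

Under these assumptions a single vehicle would produce roughly 1,460 TB per year, which suggests that the 300 TB estimate is measured on a different basis than raw sensor output.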

What Data is collected by Autonomous Vehicles?

AVs generate, collect, analyze and store an immense amount of data, serving everything from basic navigation to more complex functions such as traffic management, speeding enforcement and awareness of driving conditions. This data can be classified into three main categories:

  • Owner and Passenger Information: Autonomous vehicles may collect and maintain identifying information about the owner or passenger of the vehicle for a variety of purposes, such as to authenticate authorized use, or to customize comfort, safety, and entertainment settings.

  • Location Data: used for navigation purposes, such as route information, speed, real-time traffic data and points-of-interest along the planned route.

  • Sensor Data: autonomous vehicles (like conventional vehicles) contain sensors that collect data about the vehicle’s operation and its surroundings. These include front, rear and side cameras or dash cams, radar, thermal imaging devices, and light detection and ranging (LiDAR) devices. The collected data (e.g. street-level panorama images, point clouds, etc.) is used to identify the objects the vehicle encounters, make predictions about the surroundings, and take action based on these predictions.
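
The three categories above could be sketched as a small data model, for instance to tag each field a vehicle records before deciding how it is stored or shared. Everything in this example is an assumption made for illustration, in particular the field names such as `driver_profile_id` or `lidar_point_cloud`; real vehicle data schemas will differ.

```python
from dataclasses import dataclass
from enum import Enum


class DataCategory(Enum):
    OWNER_PASSENGER = "owner and passenger information"
    LOCATION = "location data"
    SENSOR = "sensor data"


@dataclass(frozen=True)
class DataField:
    name: str
    category: DataCategory


# Hypothetical field names, chosen only to illustrate the mapping.
FIELDS = [
    DataField("driver_profile_id", DataCategory.OWNER_PASSENGER),
    DataField("seat_position", DataCategory.OWNER_PASSENGER),
    DataField("planned_route", DataCategory.LOCATION),
    DataField("speed_kmh", DataCategory.LOCATION),
    DataField("front_camera_frame", DataCategory.SENSOR),
    DataField("lidar_point_cloud", DataCategory.SENSOR),
]


def group_by_category(fields):
    """Map each category to the names of the fields that belong to it."""
    grouped = {category: [] for category in DataCategory}
    for field in fields:
        grouped[field.category].append(field.name)
    return grouped


for category, names in group_by_category(FIELDS).items():
    print(category.value, "->", names)
```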

Addressing Privacy Concerns about Autonomous Vehicle Data

Back in 2014, Jim Farley, now Chief Executive Officer at Ford but Global Vice-President of Marketing & Sales at the time, said during the Consumer Electronics Show:

“We know everyone who breaks the law, we know when you’re doing it. We have GPS in your car, so we know what you’re doing. By the way, we don’t supply that data to anyone.”

Although he later retracted the statement, Farley’s quote highlights the privacy implications of data collection and use in vehicles. Personal information about an autonomous vehicle user’s locations and on-road behaviour, as well as footage from inside and outside cameras, may be valuable to various government and private sector entities.

Like any breakthrough technology, the large-scale collection and analysis of AV data could bring benefits but also raise public concern. For instance, insurers might adjust car insurance rates based on whether a customer speeds on the highway, regimes could identify dissidents, or an employee of the AV company could use the technology to stalk a celebrity or an ex-girlfriend, as happened at Uber.

Researchers have surveyed residents of cities with and without Uber autonomous vehicle fleets: 54% of participants said that they would spend more than five minutes using an online system to opt out of identifiable data collection. In addition, they expressed high discomfort with secondary use scenarios such as recognition, identification, and tracking of individuals and vehicles.

A variety of measures aimed at protecting personal information may be employed to increase the general acceptance of autonomous vehicles.

Legislation


As we discussed in a previous article about data protection for street-level imagery, several data protection laws currently regulate (more or less strictly) the usage of personal data: GDPR (Europe), CCPA (California), PIPEDA (Canada) and APPI (Japan), to name just a few. However, data privacy laws must be integrated with existing road traffic laws, which regulate vehicle homologation, traffic enforcement and liability, i.e. whether the driver, manufacturer or software would be held liable in the event of a collision.

For example, the US Driver Privacy Act of 2015 states that information collected by EDRs (event data recorders) belongs to the owner or lessee of the vehicle and restricts data retrieval from EDRs to certain exceptions (e.g., court orders, vehicle safety research, or to service or repair the vehicle).

EU member states' programs (e.g. those of Germany, France, Italy and Spain) are at different stages and are not yet aligned at the European level. In Germany, the Autonomous Vehicle Bill was enacted in June 2017, modifying the existing Road Traffic Act and defining the requirements for highly and fully automated vehicles, while also addressing the rights of the driver.

Privacy by Design

Although comprehensive simulation models covering the entire design lifecycle are flourishing, real-world data still matters. Data collection is therefore essential to generating the possible scenarios of everything that can happen on the road.

When it comes to data collection, the GDPR sets out the data minimization principle, i.e. limiting personal data collection to what is necessary while preserving full functionality.
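
In code, data minimization can be approximated by an allow-list per processing purpose, so that anything not explicitly needed is dropped before storage or transmission. The following is a minimal sketch under stated assumptions: the purposes, the field names and the `minimize` helper are all hypothetical and not taken from the GDPR or from any particular vehicle platform.

```python
# Hypothetical purposes and the fields each one actually needs.
ALLOWED_FIELDS = {
    "navigation": {"position", "planned_route", "speed_kmh"},
    "comfort_settings": {"driver_profile_id", "seat_position"},
}


def minimize(record: dict, purpose: str) -> dict:
    """Return only the fields that the given purpose is allowed to use."""
    allowed = ALLOWED_FIELDS.get(purpose, set())  # unknown purpose: keep nothing
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "position": (48.2082, 16.3738),
    "planned_route": ["A", "B"],
    "speed_kmh": 42,
    "driver_profile_id": "user-123",
    "cabin_camera_frame": b"...",
}
print(sorted(minimize(record, "navigation")))  # ['planned_route', 'position', 'speed_kmh']
```

Defaulting to an empty allow-list for unknown purposes is a deliberate choice here: a new use of the data has to be declared before it receives any fields at all.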

Privacy is often designed separately from the rest of the automobile or left as an integration task during manufacturing, instead of being addressed as early as possible. A study from IBM showed that finding a security error in the design phase costs less than a sixth of what it costs in the implementation phase, one fifteenth of the cost in the testing phase and one hundredth of the cost in product maintenance.
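
The ratios quoted from the IBM study can be turned into a small worked example. In the sketch below the design phase is the baseline, the implementation multiplier of 6 is a lower bound (the text says "less than a sixth"), and the baseline cost of 500 units is a purely hypothetical number.

```python
# Relative cost of finding one security error, design phase = 1x.
# The implementation value is a lower bound taken from "less than a sixth".
RELATIVE_COST = {
    "design": 1,
    "implementation": 6,
    "testing": 15,
    "maintenance": 100,
}


def cost_in_phase(design_cost: float, phase: str) -> float:
    """Cost of finding the same error in `phase`, given its design-phase cost."""
    return design_cost * RELATIVE_COST[phase]


def extra_cost_of_late_discovery(design_cost: float, phase: str) -> float:
    """How much more it costs to find the error in `phase` than at design time."""
    return cost_in_phase(design_cost, phase) - design_cost


design_cost = 500  # hypothetical cost units
for phase in RELATIVE_COST:
    print(phase, cost_in_phase(design_cost, phase))
```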

To avoid such situations, privacy by design approaches recommend that companies implement technical and organisational measures at the earliest stages of product or service development, to ensure privacy right from the start. This can be applied from idea definition (requirements, risk assessment, etc.) through to the validation phase (field test and proof of safety).

Anonymization


At the same time, the processing of all this data is not expected to be carried out by a single entity; the data will possibly be shared with third parties. Also, privacy laws require consent from the data subject before information is shared.

Alternatively, anonymization represents the most cost-effective way to stay compliant. In fact, Recital 26 of the GDPR, which describes anonymized data as “data rendered anonymous in such a way that the data subject is not or no longer identifiable”, states that “this Regulation (GDPR) does not, therefore, concern the processing of such anonymous data”.
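
For imagery, one common way to make a person or licence plate unrecognizable is to pixelate or blur the region where it was detected. The sketch below illustrates the idea on a tiny grayscale image using only the Python standard library. It is not a description of any particular product: the `pixelate_region` function, the bounding box format and the toy image are assumptions for this example, and the object detection step that would produce the box is left out. The box is assumed to lie within the image.

```python
def pixelate_region(image, box, block=4):
    """Return a copy of `image` with the pixels inside `box` pixelated.

    `image` is a list of rows of grayscale ints; `box` is
    (top, left, bottom, right) with exclusive bottom/right.
    Each block x block tile inside the box is replaced by its mean value,
    which removes fine detail such as facial features.
    """
    top, left, bottom, right = box
    result = [row[:] for row in image]  # never modify the input image
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            ys = range(y0, min(y0 + block, bottom))
            xs = range(x0, min(x0 + block, right))
            tile = [image[y][x] for y in ys for x in xs]
            mean = sum(tile) // len(tile)
            for y in ys:
                for x in xs:
                    result[y][x] = mean
    return result


# 4x4 toy image; the (hypothetical) detection box covers the top-left 2x2 pixels.
image = [
    [10, 20, 30, 40],
    [30, 40, 50, 60],
    [50, 60, 70, 80],
    [70, 80, 90, 100],
]
blurred = pixelate_region(image, box=(0, 0, 2, 2), block=2)
print(blurred[0])  # [25, 25, 30, 40]
```

Whether a given block size is strong enough for the result to count as anonymous in the sense of Recital 26 depends on the imagery and is not something this sketch settles.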

When looking for an anonymization solution that fits your needs, we recommend asking yourself the following questions:

💡 What needs to be anonymized?

📷 How many images need to be anonymized?

⌛ Is anonymization time-critical?

☁️ Cloud vs. on-premise anonymization solution

📥 What type of images should be anonymized?

🔐 Data privacy and security due diligence

🤖 Automated blurring solution vs. manual labour
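
The questions about image volume and time-criticality can be combined into a quick capacity estimate. The sketch below assumes a constant per-worker throughput and perfectly parallel workers, which real pipelines only approximate; the function names and all numbers in the example are hypothetical.

```python
import math


def processing_hours(num_images: int, images_per_hour: float, workers: int = 1) -> float:
    """Hours needed to anonymize `num_images` at a given per-worker throughput."""
    if images_per_hour <= 0 or workers <= 0:
        raise ValueError("throughput and worker count must be positive")
    return num_images / (images_per_hour * workers)


def meets_deadline(num_images: int, images_per_hour: float,
                   deadline_hours: float, workers: int = 1) -> bool:
    """True if the whole batch can be processed within the deadline."""
    return processing_hours(num_images, images_per_hour, workers) <= deadline_hours


def workers_needed(num_images: int, images_per_hour: float, deadline_hours: float) -> int:
    """Smallest number of parallel workers that meets the deadline."""
    return math.ceil(num_images / (images_per_hour * deadline_hours))


# Hypothetical batch: 1,000,000 images, 5,000 images per hour per worker, 48 h deadline.
print(processing_hours(1_000_000, 5_000))    # 200.0
print(workers_needed(1_000_000, 5_000, 48))  # 5
```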

Want to know more about this topic? Check our complete checklist for image blurring.


  • Autonomous vehicles promise to disrupt the way we consume mobility.
  • AVs generate, collect, analyze and store an immense and complex amount of data.
  • Consequently, privacy implications of data collection and use in vehicles must be addressed to protect consumers.
  • Legislation, privacy by design and anonymization may be employed to help protect personal information collected and stored by autonomous vehicles.

About Celantur

Celantur offers a fully-automated anonymization solution for images & videos to comply with privacy laws. Our technology automatically detects the objects to be anonymized and blurs them:

✅ We anonymize all kinds of RGB-imagery: planar, panorama images and videos

✅ Our cloud platform is capable of anonymizing around 200,000 panoramas per day and 90,000 video frames per hour.

✅ Industry-grade anonymization quality: detection rate up to 99%
