This post explains the reasoning and philosophy behind the ESP8266 IoT Framework. Since the framework is evolving over time, some of this post might be outdated. Find the latest on GitHub.
Fetching or posting data to the internet is one of the core tasks of an IoT device. Doing so over HTTP is implemented quite well in the default ESP8266 Arduino libraries, but for HTTPS requests things are more difficult. In this post I will discuss the most common approaches used by the community, and develop my own method to do arbitrary HTTPS requests in a secure way. This method will not require any specific certificates or fingerprints to be manually coded in the application.
HTTPS is a method to do an HTTP request over a TLS (formerly SSL) connection. By doing so, the data that is sent back and forth between your computer and the server is encrypted and protected. The good news is that this protocol can be used on the ESP8266 with the WiFiClientSecure class. The bad news is that the common methods to do so have some big disadvantages.
First I will show the two most common approaches, and next I will describe a generic solution to their problems.
Before diving into the details I will briefly explain the basic principles of secure HTTPS requests in layman's terms.
Basically, every website has a certificate. This certificate is issued by somebody, who is called a certification authority (CA). Each CA certificate can in turn be issued by another CA, which leads to the so-called certificate chain. In the picture the chain is three certificates long, but in reality the chain can have any length.
A certificate chain. Image from Wikipedia
The top-level certificate is called a root certificate. This certificate is self-signed, so there is no parent certificate to check it against; it simply has to be trusted. This is acceptable because only a few organisations can issue root certificates, and these are trusted not to issue fake or wrong certificates.
When you do an HTTPS request to a website from your browser, the browser will look at the certificate for the website and validate whether the certificate is indeed issued by its parent. This can be done because each certificate is signed with the private key of the upstream certificate. An explanation for dummies on how public and private keys work can be found here.
When it is verified that the certificate is indeed issued by a trusted root CA, the next check is that the domain in the certificate is the same as the actual domain. If that is true, we know that the server is who it claims to be, and a secure connection can be started.
These trusted root certificates are actually stored as part of your browser to be able to validate all other certificates. Each OS or browser stores a slightly different set of roughly 100-200 root certificates, which it knows can be trusted. This is called a certificate store, and this is exactly what I will apply on the ESP8266 later in this article. But first, let's start with the two most popular other approaches.
The method that is proposed in the official ESP8266 Arduino documentation is to extract the fingerprint of a site's certificate and store this in the code. The fingerprint is a hash of the certificate. Because it is very unlikely that a second certificate exists with the same hash, we know the website can be trusted if the hash is the same as the one we store.
const char* host = "api.github.com";        // host name only, used to open the TLS connection
const char* url = "https://api.github.com"; // full URL, used for the HTTP request
const int httpsPort = 443;
const char* fingerpr = "CF 05 98 89 CA FF 8E D8 5E 5C E0 C2 E4 F7 E6 C3 C7 50 DD 5C";

WiFiClientSecure client;
HTTPClient http;

client.connect(host, httpsPort);
if (client.verify(fingerpr, host))
{
    http.begin(client, url);
    String payload;
    if (http.GET() == HTTP_CODE_OK)
        payload = http.getString();
}
else
{
    Serial.println("certificate doesn't match");
}
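To make the fingerprint idea a bit more concrete, here is a minimal desktop-side sketch in Python. It assumes the fingerprint is the SHA-1 hash of the DER-encoded certificate, written as 20 space-separated hex bytes like the string in the snippet above. The function names and the placeholder bytes are made up for illustration; this is not code from the ESP8266 libraries.

```python
import hashlib


def fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-1 hash of the given bytes as spaced upper-case hex."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    # Group the 40 hex characters into 20 space-separated byte pairs
    return " ".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches(der_bytes: bytes, stored: str) -> bool:
    """Compare a presented certificate with the fingerprint stored in the code."""
    return fingerprint(der_bytes) == stored.upper()


# Placeholder bytes standing in for a real DER-encoded certificate (assumption)
fake_cert = b"not a real certificate"
stored = fingerprint(fake_cert)

print(matches(fake_cert, stored))          # the same certificate matches
print(matches(fake_cert + b"!", stored))   # any change to the certificate breaks the match
```

The second call shows why a renewed certificate needs a new fingerprint in the code: even a single changed byte gives a completely different hash.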
This approach is simple because the certificate chain does not need to be validated, but has two main issues for me:
You could argue that secure connections are overkill for your application. I would be the first to admit that I prefer a pragmatic solution where possible. Imagine that all you want to do with your ESP8266 is to fetch the weather from the internet and display it in some form. Personally I would not mind doing this in an insecure way, since there are no real dangers.
But imagine the ESP8266 is controlling the lock on your door or a 3D printer which can heat up and catch fire. Or think of the case where you are transferring personal information to or from some site or API. In these cases it is better to be safe than sorry, and the method in this section should not be used. Nevertheless, I will show it here:
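const char* host = "api.github.com";        // host name only, used to open the TLS connection
const char* url = "https://api.github.com"; // full URL, used for the HTTP request
const int httpsPort = 443;

WiFiClientSecure client;
HTTPClient http;

client.setInsecure(); //the magic line, use with caution
client.connect(host, httpsPort);
http.begin(client, url);
String payload;
if (http.GET() == HTTP_CODE_OK)
    payload = http.getString();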
So basically all you need to do is to add client.setInsecure() to your code and it will start the connection without validating the certificate.
With that out of the way, we finally get to the implementation I have chosen instead for my ESP8266 IoT Framework, which is placed in the fetch class.
const char* host = "https://api.github.com";
String payload;
if (fetch.GET(host) == HTTP_CODE_OK)
    payload = http.getString();
fetch.clean();
Looks easy enough, doesn't it?
So what happens behind the scenes? Basically the same thing as in a typical browser. The ESP8266 contains a full store of all the trusted root certificates in PROGMEM. This takes roughly 170 kB of flash memory at the moment, which in my case can easily be spared. This certificate store is generated automatically when building the software, no manual steps required. This also means that you can do secure HTTPS requests to any URL (so you could even configure or change a URL after the build).
You might think: but hey, these certificates will expire too! And this is true. The difference with fingerprints is that root certificates are valid for much longer, sometimes over 20 years, whereas the fingerprint for some services can change every few months.
As a starting point I found a great but hidden example in the ESP8266 Arduino repo. This example contains a Python script that gets all the certificates from the Mozilla root certificate store and stores them as files. These files are then uploaded to SPIFFS and used during HTTPS requests. I adapted this example to store the certificates in PROGMEM instead.
When a request is started, the certStore class will compare the hash of the certificate issuer with all the hashes of the stored root certificates. If there is a match, the correctness of the domain and other properties will be checked and the connection will be initialized.
In the default Arduino example these hashes for the stored certificates are generated in the certStore class. It seems more logical to me to do this directly in the Python script to save computing time on the ESP8266, so that's where I moved it. Furthermore I adapted the certStore class (GitHub) to read the information from my PROGMEM variables rather than the file system.
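The lookup itself can be pictured with a small Python sketch. It only mimics the matching step: one SHA-256 hash per stored root certificate, computed ahead of time, and a linear search for the hash of the issuer name that the server's certificate points to. The example names and the function find_root are hypothetical, and the real certStore class does more than this (the domain and other properties are checked afterwards).

```python
import hashlib

# Hypothetical stand-ins for the encoded names of the stored root certificates
root_names = [b"CN=Example Root CA 1", b"CN=Example Root CA 2", b"CN=Example Root CA 3"]

# Built once at build time: one SHA-256 hash per stored root certificate
indices = [hashlib.sha256(name).digest() for name in root_names]


def find_root(issuer_name: bytes):
    """Return the position of the root whose hash matches the issuer, or None."""
    wanted = hashlib.sha256(issuer_name).digest()
    for position, stored_hash in enumerate(indices):
        if stored_hash == wanted:
            return position
    return None


print(find_root(b"CN=Example Root CA 2"))  # position of the matching root certificate
print(find_root(b"CN=Unknown CA"))         # None: no trusted root, so no connection
```

Because the hashes are precomputed, the device only has to hash the issuer once per request and compare 32-byte values, instead of parsing every stored certificate.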
The final Python script to generate the certificate store is shown below.
from __future__ import print_function
import csv
import os
import sys
from asn1crypto.x509 import Certificate
import hashlib
from subprocess import Popen, PIPE, call, check_output
try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import urlopen
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

# path to openssl (raw string, so the backslashes are not read as escape sequences)
openssl = r"C:\msys32\usr\bin\openssl"

f = open("src/generated/certificates.h", "w", encoding="utf8")

f.write("#ifndef CERT_H" + "\n")
f.write("#define CERT_H" + "\n\n")
f.write("#include <Arduino.h>" + "\n\n")

# Mozilla's URL for the CSV file with included PEM certs
mozurl = "https://ccadb-public.secure.force.com/"
mozurl += "mozilla/IncludedCACertificateReportPEMCSV"

# Load the names[] and pems[] array from the URL
names = []
pems = []
dates = []
response = urlopen(mozurl)
csvData = response.read()
if sys.version_info[0] > 2:
    csvData = csvData.decode('utf-8')
csvFile = StringIO(csvData)
csvReader = csv.reader(csvFile)
for row in csvReader:
    names.append(row[0]+":"+row[1]+":"+row[2])
    pems.append(row[30])
    dates.append(row[8])
del names[0] # Remove headers
del pems[0] # Remove headers
del dates[0] # Remove headers

derFiles = []
totalbytes = 0
idx = 0

# Process the text PEM using openssl into DER files
sizes = []
for i in range(0, len(pems)):
    certName = "ca_%03d.der" % (idx)
    thisPem = pems[i].replace("'", "")
    print(dates[i] + " -> " + certName)
    f.write(("//" + dates[i] + " " + names[i] + "\n"))
    ssl = Popen([openssl,'x509','-inform','PEM','-outform','DER','-out', certName],
                shell = False, stdin = PIPE)
    pipe = ssl.stdin
    pipe.write(thisPem.encode('utf-8'))
    pipe.close()
    ssl.wait()
    if os.path.exists(certName):
        derFiles.append(certName)
        der = open(certName,'rb')
        bytestr = der.read()
        sizes.append(len(bytestr))
        cert = Certificate.load(bytestr)
        idxHash = hashlib.sha256(cert.issuer.dump()).digest()

        # for each certificate store the binary data as a byte array
        f.write("const uint8_t cert_" + str(idx) + "[] PROGMEM = {")
        for j in range(0, len(bytestr)):
            totalbytes+=1
            f.write(hex(bytestr[j]))
            if j<len(bytestr)-1:
                f.write(", ")
        f.write("};\n")

        # for each hashed certificate issuer, store the binary data as a byte array
        f.write("const uint8_t idx_" + str(idx) + "[] PROGMEM = {")
        for j in range(0, len(idxHash)):
            totalbytes+=1
            f.write(hex(idxHash[j]))
            if j<len(idxHash)-1:
                f.write(", ")
        f.write("};\n\n")

        der.close()
        idx = idx + 1

f.write("//global variables for certificates using " + str(totalbytes) + " bytes\n")
f.write("const uint16_t numberOfCertificates PROGMEM = " + str(idx) + ";\n\n")

# store a vector with the length in bytes for each certificate
f.write("const uint16_t certSizes[] PROGMEM = {")
for i in range(0, idx):
    f.write(str(sizes[i]))
    if i<idx-1:
        f.write(", ")
f.write("};\n\n")

# store a vector with pointers to all certificates
f.write("const uint8_t* const certificates[] PROGMEM = {")
for i in range(0, idx):
    f.write("cert_" + str(i))
    os.unlink(derFiles[i])
    if i<idx-1:
        f.write(", ")
f.write("};\n\n")

# store a vector with pointers to all certificate issuer hashes
f.write("const uint8_t* const indices[] PROGMEM = {")
for i in range(0, idx):
    f.write("idx_" + str(i))
    if i<idx-1:
        f.write(", ")
f.write("};\n\n#endif" + "\n")

f.close()
The generated header file is saved as certificates.h and included in the application. The Python script is hooked into PlatformIO so that it runs before each build, which means every build incorporates the latest version of the certificate store.
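To give an idea of what a single entry in certificates.h looks like, the sketch below reproduces the byte-array formatting of the script for a few made-up bytes. The helper name to_progmem_array is hypothetical and not part of the framework; it only condenses the write loop from the script into one function.

```python
def to_progmem_array(name: str, data: bytes) -> str:
    """Format raw bytes as a C byte array declaration stored in PROGMEM."""
    body = ", ".join(hex(b) for b in data)
    return "const uint8_t " + name + "[] PROGMEM = {" + body + "};"


# Three made-up bytes instead of a full certificate (assumption, for illustration)
print(to_progmem_array("cert_0", bytes([0x30, 0x82, 0x01])))
# prints: const uint8_t cert_0[] PROGMEM = {0x30, 0x82, 0x1};
```

Note that hex() does not pad with zeros, so a byte like 0x01 is written as 0x1. That is still a valid C literal, so the generated header compiles fine.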
This post only contained some snippets of the code to explain the high-level approach that was taken. The full implementation for the ESP8266 IoT Framework can be found on GitHub. The documentation for the fetch class can be found here.