Library Dependencies and the Open Source Supply Chain Nightmare
It’s a bigger problem than is immediately apparent, and has the potential for hacks as big as Equifax and as widespread as SolarWinds.
The universal need for speed and the lack of resources in commercial app development push developers to rely on free open-source software libraries. The difficulty is that there is no easy way to manage the open-source vulnerabilities that those libraries carry into the finished commercial app.
The size of this problem has been analyzed in the new Contrast Labs 2021 Open-Source Security Report. The study looked at tens of thousands of real-world applications and APIs from Contrast’s own telemetry – and found a potentially serious problem.
First of all, the average application contains 118 open-source libraries. Many of these contain vulnerabilities, but many of those vulnerabilities pose no risk, since only 38 percent of the libraries included in an app are actually used by the app.
Inside a library, a vulnerability may be confined to just one class. In Java libraries, for instance, only 32 percent of the classes contained are invoked by the application. It is entirely possible, then, that the finished app ships with a library containing a known vulnerability that poses zero risk because the vulnerable code is never called.
This is complicated by ‘transitive dependencies’, where a function deliberately pulled from one library relies in turn on additional libraries. Those additional libraries may then be included in the shipped app, and called by it, without the developers ever knowing.
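To make the idea of transitive dependencies concrete, here is a minimal sketch in Python. It assumes a hand-written dependency graph; every library name in it is invented for illustration, and a real project would read this information from its package manager's metadata rather than a hard-coded dictionary.

```python
from collections import deque

# Hypothetical dependency graph: each key is a library, each value lists the
# libraries it depends on directly. All names are invented for illustration.
DIRECT_DEPS = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["template-engine", "http-client"],
    "http-client": ["tls-helper"],
    "json-parser": [],
    "template-engine": [],
    "tls-helper": [],
}


def transitive_dependencies(root, graph):
    """Return every library reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        lib = queue.popleft()
        if lib in seen:
            continue  # already visited via another path
        seen.add(lib)
        queue.extend(graph.get(lib, []))
    return seen


# Libraries the developers chose on purpose.
declared = set(DIRECT_DEPS["my-app"])
# Everything that actually ends up in the shipped app.
shipped = transitive_dependencies("my-app", DIRECT_DEPS)
# Libraries that arrived only because something else needed them.
pulled_in_indirectly = shipped - declared

print(sorted(pulled_in_indirectly))
```

In this toy graph the team declared two libraries but ships five, which is the gap the report is pointing at.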
The result is that under-resourced teams need to manage vulnerabilities that may or may not be relevant within hundreds of libraries, possibly within many different apps, and always with the possibility that library updates may cause further downstream issues.
Teams have a choice between spending many precious hours determining whether their apps contain a library that needs to be updated and then updating it or, just as likely, simply ignoring the problem. The latter course seems to be quite popular, since the average library in use is 2.5 years old, and 6 percent are more than five years old.
Software Composition Analysis (SCA) engines exist that can return a list of CVEs found in applications. Such legacy tools do not, however, differentiate between vulnerabilities in active and inactive libraries. Contrast’s study shows that 17 percent of Critical and Major CVEs in Java applications, 15 percent in .NET applications, and 80 percent in Node applications are in inactive libraries or classes. SCA therefore tends to produce a high number of false positives in the search for active vulnerabilities, increasing the pressure on development teams to ignore the problem and not apply the update.
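The distinction between active and inactive findings can be sketched as a simple filter. The example below is an illustration only, not how any particular product works: the `Finding` record, the `split_by_activity` helper, the placeholder CVE identifiers, and the `invoked_classes` data are all assumptions, and the hard part in practice (observing which classes an application really invokes) is taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cve_id: str            # placeholder identifier, not a real CVE
    library: str           # hypothetical library name
    vulnerable_class: str  # class in which the defect lives


def split_by_activity(findings, invoked_classes):
    """Partition findings into (active, inactive).

    invoked_classes maps a library name to the set of its classes that the
    application was observed to call. A library missing from the mapping is
    treated as never used.
    """
    active, inactive = [], []
    for finding in findings:
        used = finding.vulnerable_class in invoked_classes.get(finding.library, set())
        (active if used else inactive).append(finding)
    return active, inactive


findings = [
    Finding("CVE-EXAMPLE-1", "lib-a", "lib_a.Parser"),
    Finding("CVE-EXAMPLE-2", "lib-a", "lib_a.LegacyCodec"),
    Finding("CVE-EXAMPLE-3", "lib-b", "lib_b.Uploader"),
]
# Assumed runtime observation: only lib-a is used, and only two of its classes.
invoked_classes = {"lib-a": {"lib_a.Parser", "lib_a.Util"}}

active, inactive = split_by_activity(findings, invoked_classes)
print([f.cve_id for f in active])
print([f.cve_id for f in inactive])
```

A finding that lands in the inactive list is not harmless forever; a later code change could start calling the vulnerable class, so the split is a way to prioritize work rather than a reason to stop updating.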
But the report warns that aging libraries merely magnify problems for the future:
“Failure to keep libraries updated over time not only increases risk to an organization but also makes library updates much more difficult and time-consuming when they are finally done. When a library stays dormant in an application for multiple years, any new vulnerability is difficult to fix because so much code has been built over it.”
“It’s a devil’s bargain,” Contrast’s co-founder and CTO Jeff Williams told SecurityWeek, “because the farther you get behind, the harder it is to get back up to date. So, you accrue technical debt if you don’t keep your libraries patched. But commercial companies are focused on rolling out new features and they don’t want to do those library updates if they don’t absolutely have to.”
“It’s a big problem,” he added. “All of the apps we now love and depend on – online banking, shopping, healthcare, defense, government and so on – use these libraries. If a library contains a vulnerability and is used by the app, that vulnerability becomes part of the app and can be attacked.”
PRIMARY ROUTES FOR ATTACKERS
The Equifax breach of 2017 is the iconic example. The breach was achieved via an Apache-licensed library called Object-Graph Navigation Language (OGNL), as used in the Apache Struts framework, by exploiting a defect in which text from error messages was evaluated as OGNL expressions.
Open-source libraries offer two primary routes for attackers. The first is via a discovered vulnerability as in the Equifax breach, while the second is by introducing a vulnerability into the library source. The potential harm that could be caused by the second method is massive.
A huge problem is that there is no centralized order to open source. Anybody can create a new library and make it available – usually from one of many open-source repositories – from where it can be downloaded by anyone. Other coders can add to the original. If an attacker can introduce a vulnerability at this stage, it can be included in all the apps that use the library.
That can be a very large number of apps. The top 25 Node libraries are present in more than 90 percent of Node-using apps. This is not necessarily as dangerous as it may seem, since many of the libraries are inactive within the apps, but the most common active library is nevertheless used in 42 percent of apps. A vulnerability introduced into this library, or merely discovered within it, will be included in all apps using the library, making every company that uses the finished product vulnerable. This scenario is repeated to a lesser or greater degree within all categories of library.
Part of the problem caused by the lack of an overall controlling body is that there is no simple way to track vulnerabilities and library updates. “There is no formal notification system,” said Williams. “You have to run some kind of tool that can identify all your libraries and then check them against a database and see if any of them are out of date or have a known vulnerability. That’ll give you a task list of things that you need to go update.” That task list, however, may still include a high number of false positives, where your apps and your use of the libraries are not affected by the vulnerabilities.
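The workflow Williams describes (inventory the libraries, check them against a database, produce a task list) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the inventory, the advisory records, and the `build_task_list` helper are all hypothetical, versions are assumed to be plain dotted integers, and no real vulnerability database is queried.

```python
def parse_version(text):
    """Turn a dotted version such as '1.4.2' into a comparable tuple (1, 4, 2).

    Assumes purely numeric components; real version schemes are messier.
    """
    return tuple(int(part) for part in text.split("."))


# Hypothetical inventory of libraries found in an app, with installed versions.
INVENTORY = {"lib-a": "1.4.2", "lib-b": "2.0.0", "lib-c": "0.9.1"}

# Hypothetical advisory database: each entry names the first fixed version.
ADVISORIES = [
    {"id": "ADV-1", "library": "lib-a", "fixed_in": "1.5.0"},
    {"id": "ADV-2", "library": "lib-b", "fixed_in": "1.9.3"},
    {"id": "ADV-3", "library": "lib-z", "fixed_in": "3.0.0"},
]


def build_task_list(inventory, advisories):
    """Return (library, installed, fixed_in, advisory_id) for each needed update."""
    tasks = []
    for advisory in advisories:
        installed = inventory.get(advisory["library"])
        if installed is None:
            continue  # the app does not include this library at all
        if parse_version(installed) < parse_version(advisory["fixed_in"]):
            tasks.append(
                (advisory["library"], installed, advisory["fixed_in"], advisory["id"])
            )
    return tasks


tasks = build_task_list(INVENTORY, ADVISORIES)
for library, installed, fixed_in, advisory_id in tasks:
    print(f"{library}: upgrade {installed} -> {fixed_in} ({advisory_id})")
```

Note that this sketch flags a library purely on its version number, which is exactly why such a list can contain false positives: it says nothing about whether the app actually calls the vulnerable code.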
There is still another complexity to add to the difficulty in securely managing open-source libraries: the existence of what are commonly known as ‘copyleft’ libraries. While most open-source software can be freely used, some of it comes with license conditions attached. “In most cases,” explained Williams, “this merely requires that the copyright of the developers be acknowledged. But some – like those using the GPL license – go further. Under GPL, if you use the library you have to make the resultant code also open-source under the same terms.”
This means that you must make your own code free to use, which is somewhat incompatible with commercial software. “The guy that created it had a very sort of, utopian vision of how code should be free, and everyone should share, and if we all work together then this code will benefit humanity. But most companies aren’t comfortable with that. They don’t want to make the code they may be developing under a contract for another company publicly available. And so, lawyers are nervous about any code that has a GPL license coming into their organization.”
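A team that wants to catch such licenses before they reach the lawyers could run a check along the following lines. This is a rough sketch and not legal advice: the library names are invented, the license strings are assumed to be SPDX-style identifiers, and the simple prefix rule only covers the GPL family discussed above.

```python
# License prefixes treated as strong copyleft for this illustration.
# Assumes SPDX-style identifiers such as "GPL-3.0-only" or "AGPL-3.0-only".
COPYLEFT_PREFIXES = ("GPL-", "AGPL-")

# Hypothetical mapping of libraries in an app to their declared licenses.
LIBRARY_LICENSES = {
    "lib-a": "MIT",
    "lib-b": "Apache-2.0",
    "lib-c": "GPL-3.0-only",
    "lib-d": "AGPL-3.0-only",
}


def needs_legal_review(licenses):
    """Return the sorted names of libraries whose license starts with a flagged prefix."""
    return sorted(
        name
        for name, license_id in licenses.items()
        if license_id.startswith(COPYLEFT_PREFIXES)
    )


flagged = needs_legal_review(LIBRARY_LICENSES)
print(flagged)
```

The check relies on the license each library declares, so it is only as reliable as that metadata.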
Strictly speaking, there is nothing especially difficult about managing vulnerabilities in the open-source supply chain. The problem is that it requires more staff hours than most companies can provide, or better automation than most solutions deliver. The result is that the problem is often ignored or only partially dealt with. Williams’ purpose is to shine a light on the whole issue, “so that companies can focus on what matters, and hopefully make smarter library decisions and keep us all a little safer.”