Malicious URL fraud with Unicode
Malicious URL fraud with Unicode - difficult for reviewers and CIs to detect
When attackers replace letters in URLs with Unicode characters that look identical, the substitution is hard to detect. A new CI job in the curl project provides a remedy.
In his blog, security researcher and curl maintainer Daniel Stenberg has drawn attention to a security problem caused by Unicode fraud that is difficult for reviewers, merging maintainers and CI jobs to detect.
Stenberg shows how an attacker replaces a common ASCII character in the code with an almost identical one from the Unicode table. The change is not recognizable in the code editor, but it yields a different URL, for example, which can point to malicious code. As an example, the blogger uses an Armenian letter that looks like a Latin g.
The number of possible mix-ups is large: the Unicode.org website lists the many look-alike characters, of which heise's example is only one.
Restrict Unicode
Although the diff view on GitHub marks the changed line in red for the replaced g in the URL, no difference is visible to the human eye, and a maintainer may be inclined to simply wave the change through. Gitea, in contrast, which specializes in code review, warns about the nature of the change: "This line has ambiguous unicode characters".
As a countermeasure, Stenberg's curl project has added a special CI job that checks where Unicode is allowed and where it is not. According to Stenberg, GitHub has also taken on the problem and wants to fix it.
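The following is an added example of the kind of check such a CI job can perform; it is not the curl project's own script. It scans text for characters outside ASCII, reports each one with its code point and Unicode name (the information a reviewer lacks in a plain diff), and offers a narrower policy that only requires URLs to be pure ASCII. The sample lines and the file-path handling are constructed for this example; the Armenian letter U+0563 is used here only as a constructed stand-in for the look-alike g described above.

```python
"""CI-style check for non-ASCII (potentially confusable) characters in source text.

Added example, not the curl project's own CI job. Two policies are shown:
  * strict:  no non-ASCII character anywhere (safe default for source files)
  * urls_only: non-ASCII is tolerated except inside URLs
"""
import re
import sys
import unicodedata
from pathlib import Path

# Matches http(s) URLs up to whitespace or a closing quote/bracket.
URL_RE = re.compile(r"https?://[^\s'\"<>)]+")

# Constructed samples: an untouched line and one where the ASCII 'g' in
# "package" was swapped for U+0563, an Armenian letter that renders like 'g'.
CLEAN_LINE = 'url = "https://example.com/download/package.tgz"\n'
TAMPERED_LINE = 'url = "https://example.com/download/packa\u0563e.tgz"\n'


def find_non_ascii(text):
    """Return (line, column, char, 'U+XXXX', unicode name) for every non-ASCII char.

    Columns are 1-based so they line up with what an editor shows.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((lineno, col, ch, f"U+{ord(ch):04X}", name))
    return hits


def find_non_ascii_in_urls(text):
    """Narrower policy: only URLs must be pure ASCII.

    Returns (line, url, [code points]) for each offending URL.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in URL_RE.finditer(line):
            url = match.group(0)
            if not url.isascii():
                bad = [f"U+{ord(c):04X}" for c in url if ord(c) > 0x7F]
                hits.append((lineno, url, bad))
    return hits


def format_report(hits):
    """Turn find_non_ascii() results into human-readable warning lines."""
    return [
        f"line {lineno}, col {col}: ambiguous unicode character "
        f"{cp} ({name}) in place of an ASCII character"
        for lineno, col, _ch, cp, name in hits
    ]


def scan_text(text, urls_only=False):
    """Apply one policy to a text blob; empty list means the text passes."""
    if urls_only:
        return [
            f"line {lineno}: URL contains non-ASCII {', '.join(bad)}: {url}"
            for lineno, url, bad in find_non_ascii_in_urls(text)
        ]
    return format_report(find_non_ascii(text))


def scan_paths(paths, urls_only=False):
    """CI entry point: scan files and return all findings prefixed by path."""
    report = []
    for path in paths:
        # errors="replace" keeps the scan going on files with broken encodings;
        # the replacement character U+FFFD is itself non-ASCII and gets flagged.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        report.extend(f"{path}: {line}" for line in scan_text(text, urls_only))
    return report


if __name__ == "__main__":
    # Usage: python check_unicode.py [--urls-only] FILE...   (exit 1 on findings)
    args = sys.argv[1:]
    only_urls = "--urls-only" in args
    files = [a for a in args if a != "--urls-only"]
    findings = scan_paths(files, only_urls) if files else scan_text(TAMPERED_LINE)
    for line in findings:
        print(line)
    sys.exit(1 if findings else 0)
```

Run without arguments, the script scans the constructed tampered line and reports the Armenian letter with its code point; in CI it would be pointed at the files of a pull request and fail the job on any finding. The strict policy is the one to prefer for code and build scripts; the URL-only policy is a compromise for documentation that legitimately contains non-ASCII prose.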