Malicious URL fraud with Unicode

Malicious URL fraud with Unicode - difficult for reviewers and CIs to detect

When attackers replace letters in URLs with Unicode characters that look the same, it is difficult to detect. A new CI job provides a remedy.

In his blog, security researcher and curl maintainer Daniel Stenberg has drawn attention to a security problem caused by Unicode fraud that is difficult for reviewers, mergers and CI jobs to detect.

In his blog, Stenberg shows how an attacker replaces a common ASCII character in the code with an almost identical one from the Unicode table. This is not recognizable in the code editor, but results in a different URL, for example, behind which malicious code can be hidden. As an example, the blogger uses an Armenian g.

The number of possible mix-ups is large: the many similar characters can be listed on the Unicode.org website, here in the image using the example from heise.

Restrict Unicode

Although the diff view on GitHub shows a changed paragraph in red for the g replaced in the URL, no difference is visible to the human eye and a maintainer may be inclined to simply wave the change through. In contrast, Gitea, which specializes in code review, warns about the nature of the change: "This line has ambiguous unicode characters".

As a countermeasure, Stenberg's Curl project has added a special CI job that checks where Unicode is allowed and where it is not. According to Stenberg, GitHub has also taken on the problem and wants to fix it.

Found on https://www.heise.de/news/Neue-Angriffsmasche-auf-GitHub-und-Co-Zeichentausch-mit-Unicode-in-URLs-10387719.html