Assemblyline 101 - Open Source Malware Triage
Learn how to install and use Assemblyline, the open-source malware triage tool. This 101 includes an overview, deployment walkthrough, example use case, and resources.
While analysts can individually analyze files, that process takes time and may require a plethora of tools. Having a single source that provides an automated approach to initial analysis and detection mechanisms allows analysts to sift through noise and focus on files that require more attention. This is where Assemblyline, an open-source tool created by the Canadian Centre for Cyber Security (CCCS), comes in. Assemblyline allows files to be scanned with various tools (called ‘services’) within the platform and for information about the files to be collected in one place. This blog explores:
- What is Assemblyline?
- Installing Assemblyline using Docker
- Maldoc analysis using Assemblyline
What is Assemblyline?
Assemblyline is an open-source malware detection tool that allows cybersecurity analysts to quickly triage files within a single platform. The tool consists of different modules called services that collect information about a file and can be used to alert on suspicious artifacts. The key benefit of a tool like Assemblyline is that it tags submissions with results from services as they are being analyzed and can detect duplicate submissions. Moreover, the tool assigns a score to each file based on the information collected. This score can be used to identify malicious files or files that may warrant further investigation.
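To make the scoring idea concrete, here is a minimal sketch of how a numeric score could be bucketed into a verdict. The function name verdict_for_score is hypothetical, and the thresholds (0, 300, 700, 1000) are an assumption about common default settings rather than something stated in this post; check the configuration of your own deployment before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a file score to a verdict label.
# ASSUMPTION: thresholds of 0 / 300 / 700 / 1000 are illustrative and may
# differ from the values configured in a given Assemblyline deployment.
verdict_for_score() {
  local score="$1"
  if   [ "$score" -ge 1000 ]; then echo "malicious"
  elif [ "$score" -ge 700 ];  then echo "highly suspicious"
  elif [ "$score" -ge 300 ];  then echo "suspicious"
  elif [ "$score" -ge 0 ];    then echo "informative"
  else                             echo "safe"   # negative scores
  fi
}

# Usage: verdict_for_score 850
```

A bucketed verdict like this is what lets a team sort a queue so that the highest-scoring files are reviewed first.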
Who should use Assemblyline? Assemblyline is ideal for security research and defense teams, threat researchers, and incident response professionals who need to automate and streamline the analysis, classification, and prioritization of malware samples. It is especially helpful for security teams handling large volumes of malware and seeking a scalable, customizable solution for efficient triage.
Services Available within Assemblyline
Services are modules available within Assemblyline that analyze the submitted file and extract items that may indicate maliciousness. Services fall under two categories: Assemblyline services and community services.
- Assemblyline services are services or modules bundled with the Assemblyline build and are maintained by the Assemblyline development team.
- Community services have been created by the community to augment existing functionality.
Assemblyline Services
The table below contains an overview of some of the services maintained by the Assemblyline team. The complete list, along with links to the service manifests, is available here.
Additional Community Services
The following community services are listed within the Assemblyline documentation.
Details on how to build a community service are available here.
Installing and Configuring Assemblyline
Assemblyline can be deployed on a single instance or in a clustered environment. The way in which a team chooses to deploy Assemblyline depends on its objectives. CCCS claims that both deployment mechanisms have the same analysis capabilities, but clustered environments scale better whilst offering redundancy and failover capability.
Figure 1 below compares the features of the different deployment mechanisms:
Installation Steps
The instructions below are from the Docker Installation Guide for Assemblyline:
- Install Docker
- Configure Docker to use a larger address pool
- Set up Assemblyline
- Deploy Assemblyline
1. Install Docker
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg software-properties-common
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update -y
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo ln -s /usr/libexec/docker/cli-plugins/docker-compose /usr/local/bin/docker-compose
2. Configure Docker to use a larger address pool
Create or edit /etc/docker/daemon.json and add the following lines:
{
"default-address-pools":
[
{"base":"10.201.0.0/16","size":24}
]
}
Restart Docker with sudo service docker restart
Check the status of Docker with service docker status
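As a rough illustration of what the address pool setting provides, the sketch below computes how many networks of a given size fit inside the base range. With a /16 base carved into /24 networks, Docker has 256 networks to hand out. The function name pool_network_count is hypothetical and is not part of Docker or Assemblyline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of networks with prefix length SIZE that fit
# in a pool whose base has prefix length BASE_PREFIX.
pool_network_count() {
  local base_prefix="$1" size="$2"
  echo $(( 1 << (size - base_prefix) ))
}

# Usage: pool_network_count 16 24
#
# Before restarting Docker, it can also help to confirm that the file is
# well-formed JSON, for example with Python's built-in json.tool module:
#   python3 -m json.tool /etc/docker/daemon.json
```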
3. Set up Assemblyline
Download the Assemblyline Docker Compose files:
mkdir ~/git
cd ~/git
git clone https://github.com/CybercentreCanada/assemblyline-docker-compose.git
There are two types of deployments:
- Assemblyline only
- Assemblyline with ELK monitoring stack
The ELK monitoring stack can be used to track Assemblyline metrics.
The deployment steps are the same for both, except for which directory gets copied into the deployment folder. The minimal_appliance directory will only set up Assemblyline, while the full_appliance directory will also set up ELK for monitoring.
The examples below deploy to ~/deployments; users can deploy to other directories, provided that the file system has sufficient space for the installation.
To deploy Assemblyline only, use the code snippet below:
mkdir ~/deployments
cp -R ~/git/assemblyline-docker-compose/minimal_appliance ~/deployments/assemblyline
cd ~/deployments/assemblyline
To deploy Assemblyline and the ELK stack for metrics, use the code snippet below:
mkdir ~/deployments
cp -R ~/git/assemblyline-docker-compose/full_appliance ~/deployments/assemblyline
cd ~/deployments/assemblyline
This will copy the config files into the deployment directory ~/deployments/assemblyline. The config/config.yaml file is pre-configured for use with docker-compose and the .env file contains all the default passwords.
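Because the .env file ships with default passwords, it is worth reviewing which entries hold secrets before exposing the instance. The sketch below lists the names of variables that look like credentials without printing their values. The function name list_secret_keys and the name patterns it matches (pass, secret, token) are assumptions for illustration; the actual variable names in the .env file may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names (not the values) of variables in a
# .env-style file whose names suggest they hold credentials.
# ASSUMPTION: the patterns below are illustrative guesses, not a list of
# the variables Assemblyline actually defines.
list_secret_keys() {
  local env_file="$1"
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$env_file" \
    | cut -d= -f1 \
    | grep -iE 'pass|secret|token'
}

# Usage: list_secret_keys ~/deployments/assemblyline/.env
```

Each name returned is a candidate for replacement with a strong, unique value before the deployment step.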
4. Deploy Assemblyline
- Create an SSL certificate:
openssl req -nodes -x509 -newkey rsa:4096 -keyout ~/deployments/assemblyline/config/nginx.key -out ~/deployments/assemblyline/config/nginx.crt -days 365 -subj "/C=CA/ST=Ontario/L=Ottawa/O=CCCS/CN=assemblyline.local"
- Pull the required Docker images using the commands:
cd ~/deployments/assemblyline
sudo docker-compose pull
- Build the docker containers using
sudo docker-compose build
- Pull services using
sudo docker-compose -f bootstrap-compose.yaml pull
- Once all the services have been pulled, Assemblyline can be launched using the commands:
cd ~/deployments/assemblyline
sudo docker-compose up -d --wait
sudo docker-compose -f bootstrap-compose.yaml up
Once all the services have been created, the console will output the list of services that have been launched along with the Docker IDs.
Once the Docker containers are fully up, the services can be accessed through the GUI. The web interface should be accessible at https://127.0.0.1:443 using the default credentials specified in the .env file located in ~/deployments/assemblyline.
If the web interface is not reachable at that address, check the logs to ensure that the services are up and running, and use docker ps to see which port is being used by the nginx frontend.
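The port check described above can be narrowed down with a small filter over the docker ps output. This is a minimal sketch that assumes the frontend container has "nginx" somewhere in its name, which may not hold for every deployment. The function name nginx_ports is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name|ports" lines on stdin and print the port
# mappings of any container whose name contains "nginx".
# ASSUMPTION: the frontend container name includes the string "nginx".
nginx_ports() {
  awk -F'|' 'tolower($1) ~ /nginx/ { print $2 }'
}

# Usage:
#   sudo docker ps --format '{{.Names}}|{{.Ports}}' | nginx_ports
```

If the output shows a host port other than 443, use that port in the browser address instead.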
Updating a Dockerized Assemblyline Instance
A Docker deployment of Assemblyline can be updated using the following commands:
cd ~/deployments/assemblyline
sudo docker-compose pull
sudo docker-compose build
sudo docker-compose up -d
Checking Logs
Logs can be viewed for all of the core components at once or for a specific component.
For all core components:
cd ~/deployments/assemblyline
sudo docker-compose logs
For a specific component, such as ui:
cd ~/deployments/assemblyline
sudo docker-compose logs ui
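When a component produces a lot of output, it can help to narrow the logs to lines that suggest a problem. The sketch below is one way to do that. The function name error_lines and the keyword list are assumptions chosen for illustration, not an official Assemblyline log format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only log lines that mention a likely problem.
# ASSUMPTION: the keywords below are a generic guess at what failures look
# like in the logs; adjust them to match what your deployment emits.
error_lines() {
  grep -iE 'error|exception|traceback|critical'
}

# Usage (last 200 lines of the ui component, without color codes):
#   sudo docker-compose logs --no-color --tail 200 ui | error_lines
```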
MalDoc Analysis Example
One of the benefits of Assemblyline is that it keeps the results of multiple analyzers in one place, making it easy for analysts or responders to review results. In this example, we upload a Word document that uses remote template injection to download additional payloads.
The Word document is an agreement for enterprise services. When opened, the file will connect to the hardcoded URL in the relationship file _rels\document.xml.rels and load content from there. The hardcoded URL is used to load an RTF file from an adversary-controlled domain as shown in Figure 13.
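For readers who want to confirm this kind of finding by hand, the sketch below pulls externally targeted relationships out of a relationship file. It is a simple text filter, not a full XML parser, and it assumes each Relationship element sits on one line of input. The function name external_targets and the file name sample.docx are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a .rels file on stdin and print the Target of
# every relationship marked TargetMode="External".
# NOTE: this is a grep-based sketch and will miss unusual formatting.
external_targets() {
  grep -o '<Relationship [^>]*>' \
    | grep 'TargetMode="External"' \
    | grep -o 'Target="[^"]*"' \
    | sed -e 's/^Target="//' -e 's/"$//'
}

# Usage (a .docx is a ZIP archive; the relationship file lives under word/):
#   unzip -p sample.docx word/_rels/document.xml.rels | external_targets
```

Any URL printed here is content the document will try to fetch when opened, so handle the sample in an isolated environment.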
Once a user uploads the file to Assemblyline, it starts the analysis process where each service runs against the file and results are collated. The verdict is updated at the end of the analysis based on the information returned by the services.
Each submission is given its own unique identifier and the submission information shows details about the analysis features that have been selected. Users can choose which services to use in an effort to speed up scans and adjust priority.
Under the Submission Information section is the Heuristics section, which outlines the results of the analysis.
Here, the services identify an IOC that is part of a blocklist. Clicking on any of the heuristics will provide more information about the finding. In the case of Badlisted IOC, the results show that a domain found within the document is part of the threatview.io domain blocklist.
Moreover, OLETools identifies an external relationship within the document. The service identifies a hardcoded URL that would be used to establish a connection to a malicious domain.
Potential indicators of compromise are shown in a separate section on the submission page. The ‘Indicators of Compromise’ section can be used to quickly see any IPs, domains, or hashes related to the submitted file. In this example, the IOCs include the URL identified by OLETools and its domain. The tool also identifies several Microsoft URLs, but color-codes them green to indicate that they are not malicious.
When a user clicks on a particular IOC, the associated file will be highlighted. This can be used to streamline manual analysis flows by pinpointing which file a user should look into. Furthermore, the ‘Files’ section highlights all the files identified within the sample. In our example, the Word document contains several XML files, one of which Assemblyline has flagged as malicious. Each file is extracted and run through the services for individual analysis. The results for these files can be viewed by clicking on the filename under the Files section.
When we dig deeper into the extracted file named 9177f499.xml, we see where it originated from using the ancestry service. This tree illustrates the relations to the original file submitted and any services that generated findings.
Assemblyline helps reduce the number of benign files that investigators spend time analyzing during the day. By running files through an automated pipeline of services, investigators can get a sense of what the file is doing prior to manual inspection and prioritize threats more effectively. This process, combined with the fact that the submission results are stored on a central platform, allows the platform to serve as a single triage source for samples. For additional resources on Assemblyline and its capabilities, check out the references below.