Discover how to set up Prometheus and Grafana on an EC2 instance for monitoring. This guide covers installation, configuration, node exporters, and dashboards.
In the world of cloud computing, monitoring is a vital component for maintaining system health and ensuring effective alerting mechanisms. Amazon's EC2 instances, like any other infrastructure, require robust monitoring to track performance metrics such as CPU usage, memory consumption, and network I/O. This is where tools like Prometheus and Grafana come into play. Prometheus is an open-source systems monitoring and alerting toolkit, while Grafana provides a powerful platform for data visualization and analytics. Together, they form a comprehensive solution for EC2 monitoring.
Setting up Prometheus on an EC2 instance involves several steps. First, install Prometheus on your EC2 instance by downloading the latest release from the Prometheus website. Once installed, configure the Prometheus server by editing the `prometheus.yml` file to scrape metrics from your EC2 instances. This typically involves specifying the job name and target EC2 instance IPs or DNS names. Ensure that your security groups allow Prometheus to access the necessary ports.
After setting up Prometheus, the next step is to install Grafana. Download and install Grafana on the same or a separate EC2 instance, depending on your architecture preferences. Once installed, you can access Grafana via its web interface and add Prometheus as a data source. This will enable you to create custom dashboards to visualize your EC2 metrics. Grafana's intuitive interface allows you to build dashboards using pre-defined templates or custom queries, providing insights into your EC2 instance's performance in real time.
To begin installing Prometheus on an EC2 instance, first ensure you have an AWS account and an EC2 instance running a compatible Linux distribution, such as Amazon Linux 2 or Ubuntu. Start by connecting to your EC2 instance using SSH. Once connected, update your package manager to ensure all packages are up-to-date. This can be done using the following command:
sudo yum update -y # for Amazon Linux
sudo apt-get update -y # for Ubuntu
Next, download the latest version of Prometheus from the official Prometheus download page. You can use the `wget` command to fetch the tarball directly to your EC2 instance. Once downloaded, extract the files and move the Prometheus binaries to a directory in your system's PATH, such as `/usr/local/bin`:
wget https://github.com/prometheus/prometheus/releases/download/v2.31.1/prometheus-2.31.1.linux-amd64.tar.gz
tar xvfz prometheus-2.31.1.linux-amd64.tar.gz
sudo mv prometheus-2.31.1.linux-amd64/prometheus /usr/local/bin/
sudo mv prometheus-2.31.1.linux-amd64/promtool /usr/local/bin/
After moving the binaries, you need to set up a directory for Prometheus configuration files and data. Create a new directory called `/etc/prometheus` for configuration files and `/var/lib/prometheus` for data storage. Then, copy the default configuration file to the new config directory and modify it to include your EC2 instance as a target:
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo cp prometheus-2.31.1.linux-amd64/prometheus.yml /etc/prometheus/prometheus.yml
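Before starting the server, it can help to validate the edited file. The following is a minimal sketch, assuming `promtool` was moved to `/usr/local/bin` as in the earlier step; the function name `check_prometheus_config` is illustrative and not part of Prometheus:

```shell
# Validate a Prometheus config file before (re)starting the server.
# Returns non-zero if promtool is missing or the file does not pass the check.
check_prometheus_config() {
  config_file="${1:-/etc/prometheus/prometheus.yml}"
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found on PATH; skipping validation" >&2
    return 2
  fi
  promtool check config "$config_file"
}

# Usage:
#   check_prometheus_config                      # checks /etc/prometheus/prometheus.yml
#   check_prometheus_config ./prometheus.yml     # checks a local copy
```

Running the check before each restart catches YAML indentation mistakes early, which are otherwise only reported when the service fails to start.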
Finally, you can start Prometheus as a background service. To ensure Prometheus starts automatically on system boot, create a systemd service file. This file should define the location of the Prometheus binary, configuration, and data directory. Once the service file is created, enable and start the Prometheus service:
sudo nano /etc/systemd/system/prometheus.service
Add the following content to the service file:
[Unit]
Description=Prometheus
[Service]
ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus
[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
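The unit file above has no `User=` directive, so systemd runs Prometheus as root. A sketch of a slightly hardened variant follows, assuming a dedicated `prometheus` system user (an assumption; the steps above do not create one). It writes the unit to a temporary file for review rather than installing it directly, and the `UNIT_FILE` path is illustrative:

```shell
# Write a candidate unit file to a scratch location for review.
# UNIT_FILE and the prometheus user/group are illustrative assumptions.
UNIT_FILE="${UNIT_FILE:-${TMPDIR:-/tmp}/prometheus.service}"

cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

echo "Wrote $UNIT_FILE"

# To apply it (requires root), something like:
#   sudo useradd --system --no-create-home --shell /bin/false prometheus
#   sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus
#   sudo cp "$UNIT_FILE" /etc/systemd/system/prometheus.service
#   sudo systemctl daemon-reload && sudo systemctl restart prometheus
```

`Restart=on-failure` lets systemd bring the server back after a crash, and the ordering on `network-online.target` avoids starting before the network is usable.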
Visit `http://
To effectively configure Prometheus for monitoring, start by installing Prometheus on your EC2 instance. First, download the latest version of Prometheus from the official Prometheus download page. Extract the downloaded tarball and navigate to the Prometheus directory. Here, you'll find the `prometheus.yml` configuration file, which is crucial for defining your monitoring targets and scrape intervals.
Open the `prometheus.yml` file in a text editor and configure it to monitor your EC2 instances. You need to specify the job name and the target EC2 instances' IP addresses or DNS names. For example:
scrape_configs:
  - job_name: 'ec2_instances'
    static_configs:
      - targets: [':9100']
Save the file and start Prometheus by executing the `./prometheus` command. Ensure that port 9090 is open in your EC2 instance's security group so that you can reach the Prometheus web interface. You can verify Prometheus is running by accessing http:// in your web browser. This setup allows Prometheus to collect metrics from the specified targets, which you can later visualize in Grafana.
Setting up Node Exporters is a crucial step in enabling Prometheus to collect metrics from your EC2 instances. Node Exporters act as agents that expose hardware and OS metrics in a format that Prometheus can scrape. To begin, you'll need to install Node Exporter on each EC2 instance you wish to monitor. First, download the latest version of Node Exporter from the official Prometheus download page. Ensure you choose the correct binary for your operating system.
Once downloaded, extract the files and move the Node Exporter binary to a directory in your system's PATH, such as `/usr/local/bin`. Because the binary is now on your PATH, you can start Node Exporter by name:
node_exporter
Ensure you configure your firewall rules to allow Prometheus to scrape metrics from the Node Exporter's default port, which is 9100.
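On EC2, the firewall in question is usually the instance's security group. The following is a hedged sketch using the AWS CLI, where both security group IDs are placeholders and the function name is illustrative. It prints the command by default and only calls AWS when `APPLY=1` is set:

```shell
# Allow the Prometheus server's security group to reach Node Exporter (TCP 9100)
# on the monitored instances. Both group IDs below are hypothetical placeholders.
TARGET_SG="${TARGET_SG:-sg-0123456789abcdef0}"          # attached to monitored instances
PROMETHEUS_SG="${PROMETHEUS_SG:-sg-0fedcba9876543210}"  # attached to the Prometheus server

allow_node_exporter_scrape() {
  set -- aws ec2 authorize-security-group-ingress \
    --group-id "$TARGET_SG" \
    --protocol tcp \
    --port 9100 \
    --source-group "$PROMETHEUS_SG"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"         # actually modify the security group
  else
    echo "$*"    # dry run: print the command only
  fi
}

allow_node_exporter_scrape
```

Restricting the source to the Prometheus server's security group, rather than opening port 9100 to a wide CIDR range, keeps the metrics endpoint off the public internet.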
To automate the start of Node Exporter on system reboot, consider creating a systemd service. Create a file named `/etc/systemd/system/node_exporter.service` with the following content:
[Unit]
Description=Node Exporter
[Service]
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=default.target
After saving the file, enable and start the service with:
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
This setup ensures Node Exporter runs continuously, providing consistent metrics for Prometheus to collect.
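A quick way to confirm that the exporter is serving data is to fetch its metrics page and count the `node_` samples. This is a small sketch; `count_node_metrics` is a hypothetical helper name, and the usage line assumes Node Exporter is listening on its default port on the local machine:

```shell
# Count the node_* samples in Prometheus text exposition format read from stdin.
# Comment lines (# HELP / # TYPE) are ignored because they do not start with "node_".
count_node_metrics() {
  grep -c '^node_'
}

# Usage on a monitored instance:
#   curl -s http://localhost:9100/metrics | count_node_metrics
#   systemctl is-active node_exporter
```

A count of zero, or a failed `curl`, suggests the service is not running or the port is blocked.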
To install Grafana on an EC2 instance, you first need to ensure that your server is updated and has the necessary dependencies. Begin by connecting to your EC2 instance via SSH. Once connected, update your package index to ensure you have access to the latest software:
sudo yum update -y
Next, you'll need to set up the Grafana repository. This can be done by creating a new repository file in the /etc/yum.repos.d/ directory. Use the following command to add the Grafana repository:
sudo tee /etc/yum.repos.d/grafana.repo <
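A repository definition for Grafana's RPM packages typically looks like the sketch below. It is based on Grafana's published RPM repository at `rpm.grafana.com`; treat the URLs and options as assumptions and confirm them against the current Grafana installation documentation. The script writes to a temporary file first (the `REPO_FILE` path is illustrative) so the content can be reviewed before it is installed:

```shell
# Write a candidate yum repository definition to a scratch file for review.
REPO_FILE="${REPO_FILE:-${TMPDIR:-/tmp}/grafana.repo}"

cat > "$REPO_FILE" <<'EOF'
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF

echo "Wrote $REPO_FILE"

# To install it (requires root):
#   sudo cp "$REPO_FILE" /etc/yum.repos.d/grafana.repo
```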
With the repository in place, you can now install Grafana using the yum package manager. Execute the following command to install Grafana:
sudo yum install grafana -y
Once Grafana is installed, you need to start the Grafana service and enable it to run on boot. This ensures that Grafana is always available to provide monitoring dashboards:
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
Grafana runs on port 3000 by default. Open this port in your EC2 security group settings to allow incoming traffic. You can access the Grafana web interface by navigating to http://your-ec2-public-ip:3000 in your web browser. The default login credentials are 'admin' for both the username and password, which you should change upon first login for security reasons.
Once you have both Prometheus and Grafana installed and running on your EC2 instance, the next step is to connect Grafana to Prometheus. This integration allows you to visualize the data collected by Prometheus using Grafana's powerful dashboard capabilities. Start by logging into Grafana, typically accessible via http://. Use the default credentials `admin` for both username and password, and remember to change them after the first login for security reasons.
In the Grafana interface, navigate to the "Data Sources" section from the sidebar and click on "Add data source." Choose "Prometheus" from the list of available data sources. You will need to configure the Prometheus server URL, which should be `http://localhost:9090` if Prometheus is running on the same EC2 instance. After entering the URL, click "Save & Test" to ensure Grafana can connect to Prometheus. If successful, you will see a confirmation message.
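The same data source can also be defined in a file instead of through the web interface, using Grafana's provisioning mechanism. The following is a minimal sketch, assuming Prometheus runs on the same instance and a package-based Grafana install that reads `/etc/grafana/provisioning/datasources/`; the `DS_FILE` path and file name are illustrative:

```shell
# Write a candidate data source provisioning file to a scratch location.
DS_FILE="${DS_FILE:-${TMPDIR:-/tmp}/prometheus-datasource.yaml}"

cat > "$DS_FILE" <<'EOF'
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF

echo "Wrote $DS_FILE"

# To apply it (requires root):
#   sudo cp "$DS_FILE" /etc/grafana/provisioning/datasources/prometheus.yaml
#   sudo systemctl restart grafana-server
```

Keeping the data source in a file makes the setup reproducible when the Grafana instance is rebuilt.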
With Prometheus successfully added as a data source, you can now create custom dashboards to monitor your EC2 instances. Go to the "Dashboards" section and click "New Dashboard." Use Grafana's query editor to select the metrics you want to visualize. You can create a variety of graphs, alerts, and panels based on the data collected by Prometheus. For more detailed guidance, consult the official Grafana documentation.
Creating custom Grafana dashboards allows you to visualize the data collected by Prometheus in a way that is most meaningful to your operations team. Grafana provides a user-friendly interface to build these dashboards using a variety of panels such as graphs, tables, and heatmaps. To start creating a dashboard, log into your Grafana instance and click on the "+" icon in the sidebar, then select "Dashboard". From here, you can add panels by selecting "Add New Panel" and configuring the data source and visualization type.
Each panel can be customized with queries that pull data from Prometheus. Use the built-in query editor to craft PromQL queries that fetch the specific metrics you need. You can also apply functions and transformations to your data for more refined analysis. Consider organizing your panels into rows to group related metrics, and use templating to create dynamic dashboards that adapt to different variables, such as instance names or time ranges.
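The kind of PromQL a panel might use can be sketched as follows. The two queries are commonly used Node Exporter expressions rather than ones prescribed by this guide, and `run_promql` is a hypothetical helper that sends a query to the Prometheus HTTP API so it can be tried outside Grafana first:

```shell
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"

# Percentage of CPU time not spent idle, averaged per instance over 5 minutes.
CPU_USAGE_QUERY='100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'

# Percentage of memory in use, based on the kernel's MemAvailable estimate.
MEMORY_USAGE_QUERY='(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100'

# Run an instant query against the Prometheus HTTP API and print the raw JSON.
run_promql() {
  curl -sG "$PROMETHEUS_URL/api/v1/query" --data-urlencode "query=$1"
}

# Usage (needs a reachable Prometheus server):
#   run_promql "$CPU_USAGE_QUERY"
#   run_promql "$MEMORY_USAGE_QUERY"
```

The same expressions can be pasted directly into a panel's query editor. Grouping by `instance` is what makes them work with a templated instance variable.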
Once your panels are set, you can further enhance your dashboard by adding annotations for important events or alerts. These can be integrated with Grafana's alerting system to provide real-time notifications. To share your dashboard with team members, use the "Share" button to generate a link or export it as JSON. For more detailed guidance on creating dashboards, refer to the Grafana Documentation.
Setting alerts in Grafana is a pivotal step to ensuring that you are promptly notified of any anomalies or critical issues in your EC2 environment. Alerts in Grafana work by evaluating your queries at regular intervals and triggering notifications when certain conditions are met. To set up alerts, navigate to the dashboard you have created for monitoring and click on the panel where you want to add an alert. Ensure that your data source is correctly set to Prometheus, as this is where the metrics will be evaluated.
Once you are in the panel settings, switch to the "Alert" tab and use the "Create Alert" button to start configuring an alert rule. Define the alert conditions, for example triggering an alert if CPU usage exceeds 80% for five minutes, and specify the evaluation interval. You can set multiple conditions if necessary. After defining the alert, set up a notification channel (called a contact point in recent Grafana versions) to receive alerts via email, Slack, or other supported services. This ensures that you are informed promptly, which helps minimize downtime.
Grafana also allows you to manage alert rules efficiently. You can view all alerts in the "Alerting" section, where you can enable, disable, or delete them as needed. By organizing your alerts and ensuring they are meaningful and actionable, you can maintain robust monitoring of your EC2 instances. For a comprehensive guide on setting up alerts in Grafana, refer to the official Grafana documentation.
Once you have completed the setup of Prometheus and Grafana on your EC2 instance, it's crucial to test and validate the configuration to ensure that monitoring is correctly implemented. Start by verifying that the Prometheus service is running. You can do this by accessing the Prometheus web interface through your browser. Navigate to http://. This should display the Prometheus dashboard, indicating that the server is active and functioning.
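The same check can be scripted from the instance itself using the health endpoint Prometheus exposes at `/-/healthy`. This is a minimal sketch, assuming `curl` is installed and Prometheus listens on its default port; `prometheus_is_healthy` is an illustrative name:

```shell
# Return success if the Prometheus health endpoint answers with an HTTP 2xx status.
# -f makes curl fail on HTTP errors; --max-time bounds how long the check waits.
prometheus_is_healthy() {
  curl -fsS --max-time 5 "${1:-http://localhost:9090}/-/healthy" >/dev/null 2>&1
}

# Usage:
#   if prometheus_is_healthy; then
#     echo "Prometheus is up"
#   else
#     echo "Prometheus is not responding; check: sudo systemctl status prometheus"
#   fi
```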
Next, ensure that Prometheus is correctly scraping metrics from the node exporter. In the Prometheus dashboard, click on the "Status" menu and select "Targets." Here, you should see your EC2 instance listed as a target with the status "UP." If it's not, revisit your Prometheus configuration file to ensure the node exporter's IP and port are correctly specified. For more guidance on troubleshooting, consider checking the official Prometheus documentation.
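The target status shown on that page is also available from the HTTP API at `/api/v1/targets`. The sketch below counts healthy targets with plain `grep`, which assumes the compact JSON that Prometheus emits; a real JSON parser such as `jq` would be more robust. `count_up_targets` is a hypothetical helper name:

```shell
# Count targets reported as healthy in /api/v1/targets JSON read from stdin.
count_up_targets() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Usage on the Prometheus host:
#   curl -s http://localhost:9090/api/v1/targets | count_up_targets
```

If the count is lower than the number of instances you configured, the "Targets" page shows the scrape error for each failing target.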
After confirming Prometheus is operational, validate that Grafana is properly connected and can visualize the metrics. Log into your Grafana instance by navigating to http://. Create a new dashboard and add a graph panel. Choose Prometheus as the data source and input a query based on `node_cpu_seconds_total`. This metric is a cumulative counter, so wrap it in `rate()`, for example `rate(node_cpu_seconds_total[5m])`, to see CPU usage over time rather than an ever-increasing total. If the graph displays data, your setup is complete. If not, ensure the data source settings in Grafana are correctly pointed to your Prometheus instance.
To ensure effective EC2 monitoring with Prometheus and Grafana, it's important to follow best practices that enhance the accuracy and reliability of your monitoring setup. Start by ensuring that your Prometheus server is configured to scrape metrics at regular intervals. This involves setting an appropriate scrape interval that balances the need for up-to-date data with the overhead of frequent data collection. Typically, a scrape interval of 15 to 30 seconds is recommended for most environments.
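A sketch of how these intervals appear in `prometheus.yml` follows. The target address is a hypothetical private IP and the output path is illustrative; the file is written to a scratch location so it can be reviewed, and checked with `promtool check config`, before replacing the live configuration:

```shell
# Write an example configuration with a global interval and a per-job override.
CONFIG_FILE="${CONFIG_FILE:-${TMPDIR:-/tmp}/prometheus.example.yml}"

cat > "$CONFIG_FILE" <<'EOF'
global:
  scrape_interval: 15s      # default for every job
  evaluation_interval: 15s  # how often alerting and recording rules are evaluated

scrape_configs:
  - job_name: 'ec2_instances'
    scrape_interval: 30s    # per-job override for less critical targets
    static_configs:
      - targets: ['10.0.1.12:9100']  # hypothetical private IP of a monitored instance
EOF

echo "Wrote $CONFIG_FILE"
```

A shorter interval gives finer-grained graphs at the cost of more storage and more load on each target, so the per-job override is a way to spend that budget only where it matters.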
Additionally, make use of Prometheus' alerting features to set up alerts for key metrics. This involves configuring alert rules in Prometheus that fire when anomalies occur or critical thresholds are breached. Prometheus itself only evaluates the rules; notifications are routed by Alertmanager, which you can integrate with a system like Slack or PagerDuty to ensure timely responses. For more detailed guidance on setting up alerts, refer to the Prometheus Alertmanager documentation.
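As an illustration, a rule file for a CPU threshold might look like the sketch below. The group name, alert name, severity label, and output path are assumptions chosen for the example, and the 80% for five minutes threshold is only a starting point to tune for your workload:

```shell
# Write an example alerting rule file to a scratch location for review.
RULES_FILE="${RULES_FILE:-${TMPDIR:-/tmp}/ec2_alerts.yml}"

cat > "$RULES_FILE" <<'EOF'
groups:
  - name: ec2_alerts
    rules:
      - alert: HighCpuUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "CPU usage above 80% on {{ $labels.instance }}"
EOF

echo "Wrote $RULES_FILE"

# To use it: validate with `promtool check rules "$RULES_FILE"`, copy it next to
# prometheus.yml, and reference it there under a top-level `rule_files:` list.
```

The `for: 5m` clause means the expression must stay above the threshold for five minutes before the alert fires, which filters out short spikes.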
When visualizing your data in Grafana, create dashboards that are intuitive and focus on the most critical metrics for your EC2 instances. Consider using templating features in Grafana to make your dashboards reusable across different instances or environments. Additionally, ensure that your dashboards are regularly reviewed and updated to reflect any changes in your infrastructure or monitoring requirements. This proactive approach will help maintain the relevance and usefulness of your monitoring setup over time.