Posts tagged “apache”

Let’s Encrypt TLS certificate setup for Apache on Debian 7


Through Let’s Encrypt, anybody can now easily obtain and install a free TLS (or SSL) certificate on their web site. The basic use case for a single host is very simple and straightforward to set up, as seen here. For multiple virtual hosts, it is simply a case of rinse and repeat.

On older distributions, a bit more effort is required. E.g. on Debian 7 (Wheezy), the required version of the Augeas library (libaugeas0, augeas-lenses) is not available, so the edits to the Apache config files have to be managed by hand. Furthermore, when transitioning from an old HTTP-based server, you need to configure redirects for any old links which might still hard-code “http” in the URL. Finally, there are some security decisions to consider when selecting which encryption protocols and ciphers to support.

Installation and setup

Because the installer has only been packaged for newer distributions so far, a manual download is required. The initial execution of the letsencrypt-auto script will install further dependencies.

sudo apt-get install git
git clone https://github.com/letsencrypt/letsencrypt /usr/local/letsencrypt
 
cd /usr/local/letsencrypt
./letsencrypt-auto --help

To acquire the certificates independently of the running Apache web server, first shut it down, and use the stand-alone option for letsencrypt-auto. Replace the email and domain name options with the correct values.

apache2ctl stop
 
./letsencrypt-auto certonly --standalone --email johndoe@example.com -d example.com -d www.example.com

Unless specified on the command line as above, there will be a prompt to enter a contact email, and to agree to the terms of service. Afterwards, four new files will be created:

/etc/letsencrypt/live/example.com/cert.pem
/etc/letsencrypt/live/example.com/chain.pem
/etc/letsencrypt/live/example.com/fullchain.pem
/etc/letsencrypt/live/example.com/privkey.pem

If you don’t have automated, regular backups of /etc, now is a good time to at least back up /etc/letsencrypt and /etc/apache2.
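For example, a quick one-off snapshot could look like this (the destination path is just an example):

sudo tar czf /root/etc-backup-$(date +%F).tar.gz /etc/letsencrypt /etc/apache2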

In the Apache config for the virtual host, add a new section (or a new file) for the TLS/SSL port 443. The important new lines in the HTTPS section use the files created above. Please note, this example is for an older Apache version, typically available on Debian 7 Wheezy. See these notes for newer versions.

# This will change when Apache is upgraded to >2.4.8
# See https://letsencrypt.readthedocs.org/en/latest/using.html#where-are-my-certificates
 
SSLEngine on
 
SSLCertificateFile /etc/letsencrypt/live/example.com/cert.pem
SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
SSLCertificateChainFile /etc/letsencrypt/live/example.com/chain.pem
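For context, a minimal HTTPS virtual host using these directives might look like the following sketch (ServerName and DocumentRoot are placeholders to be replaced with your own values):

<VirtualHost *:443>
  ServerName example.com
  ServerAlias www.example.com
  DocumentRoot /var/www/example.com

  SSLEngine on
  SSLCertificateFile /etc/letsencrypt/live/example.com/cert.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
  SSLCertificateChainFile /etc/letsencrypt/live/example.com/chain.pem
</VirtualHost>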

To automatically redirect links which have hard-coded http, add something like this to the old port *:80 section.

# Redirect from http to https
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
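The R flag alone issues a temporary (302) redirect. Once the HTTPS setup is verified to work, the redirect can be made permanent so clients and search engines update their links:

RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]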

While editing the virtual site configuration, it is worth taking a look at the log format string. Typically, the “combined” log format is used. However, it does not indicate which protocol was used to serve the page. To show the port number (which implies the protocol), change to “vhost_combined” instead. For example:

CustomLog ${APACHE_LOG_DIR}/example_com-access.log vhost_combined

To finish, optionally edit /etc/apache2/ports.conf, and add the NameVirtualHost line to the SSL section, as shown below. It enables multiple named virtual hosts over SSL, but will not work on old Windows XP systems. Tough luck.

<IfModule mod_ssl.c>
  NameVirtualHost *:443
  Listen 443
</IfModule>

Finally, restart Apache to activate all the changes.

apache2ctl restart
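Should the restart fail, the configuration syntax can be checked with:

apache2ctl configtest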

Verification and encryption ciphers

SSL Labs has an excellent and comprehensive online tool to verify your certificate setup. Fill in the domain name field there, or replace your site name in the following URL, and wait a couple of minutes for the report to generate. It will give you a detailed overview of your setup, what works, and what is recommended to change.

https://www.ssllabs.com/ssltest/analyze.html?d=example.com

Ideally, you’ll get a grade A as shown in the image below. However, a few more adjustments might be required to get there. It typically has to do with the protocols and ciphers the web server is configured to accept and use. This is of course a moving target as security and cryptography research and attacks evolve. Right now, there are two main considerations: First, all the old SSL protocol versions are broken and obsolete, so they should be disabled. Secondly, there is an attack on the RC4 cipher; keeping RC4 enabled used to be a compromise between its own weaknesses and mitigating the “BEAST” attack, but by now disabling RC4 seems to be preferred.

Taking all this into account, the recommended configuration for Apache and OpenSSL as it stands disables all SSL protocol versions, as well as the RC4 cipher. This should also result in a configuration with forward secrecy. Again, this is a moving target, so it will have to be updated in the future.

To make these changes, edit the Apache SSL mod file /etc/apache2/mods-available/ssl.conf directly, or update the relevant virtual host site config file with the following lines.


SSLHonorCipherOrder on
 
SSLCipherSuite "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS !RC4 !ECDHE-RSA-RC4-SHA"
 
SSLProtocol all -SSLv2 -SSLv3
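To spot-check the protocol settings from the command line, openssl can attempt a handshake with a specific protocol version. With the configuration above, the first command should fail and the second succeed (replace example.com with your own host):

echo | openssl s_client -connect example.com:443 -ssl3
echo | openssl s_client -connect example.com:443 -tls1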

Restart Apache, and regenerate the SSL Labs report. Hopefully, it will give you a grade A.

Final considerations

Even with all the configuration above in place, the all-green TLS/SSL security lock icon in the browser URL bar, as seen below right, might be elusive. Instead, a yellow warning like the one in the image to the left might show. This usually stems from legacy URLs which hard-code the http protocol, both for internal site links and for external resources like images and scripts. The fix is a matter of either using relative links (excluding the protocol and host altogether), absolute site links, protocol-relative links (inferring the protocol by not specifying it), or hard-coding https. Examples of each, in that order:

<img src="blog_pics/ssl_secure.png">
 
<img src="/blog_pics/ssl_secure.png">
 
<img src="//i.creativecommons.org/l/by-sa/3.0/88x31.png">
 
<img src="https://i.creativecommons.org/l/by-sa/3.0/88x31.png">

On a blog like this, it certainly makes sense to put in some effort to update static pages, and make sure that new articles are formatted correctly. However, going through all the hundreds of old articles might not be worth it. When they roll off the main page, the green icon will also show here.
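To hunt down remaining hard-coded links in static content, a recursive grep over the document root is a quick start (the path below is just an example):

grep -rn 'src="http://' /var/www/example.com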


The Do-It-Yourself Cloud


“In the cloud”

The buzzword “cloud” seems to be here to stay for quite a while longer. The problem is that it is rather ill-defined, and sometimes it is used to mean “on the Internet”, regardless of how or where a particular service or content is hosted.

Only by picking up further buzzwords can we add some meaning to the term. Although there are even more terms in use, I would like to focus on two of them. First, Infrastructure as a Service (IaaS), or what traditionally has been called “hosting”: virtual or dedicated machines which you can install and operate at OS root level, with little or no oversight. Examples include your local hosting provider, and global businesses like Amazon EC2 and Rackspace.

Secondly, Software as a Service (SaaS), where you don’t write the software or maintain the system yourself. All it takes is to sign up for a service, and start using it. Think Google Apps, which includes GMail, Docs, Calendar, Sites and much more; or Salesforce, Microsoft Office 365, etc. Often these services are billed as “free”, with no financial cost to private users, and the development and operating costs of the provider are financed through various advertisement programs.

Black Clouds

The problem with the latter model, Software as a Service, is that it can put many constraints on the user, including what you are allowed to do or say, or even make it difficult for you to move to another provider. In his 2011 essay “It’s the end of the web as we know it”, Adrian Short likens users of this model to tenants: If you merely rent your home, there are many things you will not be allowed to do, or which you do not have control over. Short focuses on web hosting, where using a service like Blogger will not let you control how links are redirected, or, were you to move in the future, take those page-clicks with you onto your new site. The same goes for e-mail: If AOL decides that their e-mail service is not worthwhile tomorrow, many people will lose e-mails with no chance to redirect. Or look at all the storage services which collapsed in the wake of the raid on MegaUpload. A lot of users are still waiting for the FBI to return their files.

More recently, the security expert Bruce Schneier wrote about the same problem, but from a security perspective. We are not only tenants, he claims, but serfs in a feudal system, where the service providers take care of all the security issues for us, but in return our eyeballs are sold to the highest bidder, and again it is difficult to move out. For example, once you’ve invested in music or movies from Apple iTunes, it is not trivial to move to Amazon’s MP3 store; and if you’ve put all your contacts into Facebook, it is almost impossible to move to MySpace.

In early December, Julian Assange surfaced to warn about complete surveillance, and governments fighting to curb free speech. His style of writing is not always as straight to the point as one could wish for, but in between there is a clear message: Encrypt everything! This has spurred interesting discussion all over the Internet, with a common refrain: Move away from centralized services, build your own.

Finally, Karsten Gerloff, president of the Free Software Foundation Europe (FSFE), touched on the same theme in his talk at LinuxCon Europe in Barcelona, in November 2012. He highlighted the same problems with centralised control as discussed above, and also mentioned a few examples of free software alternatives which distribute various services. More about those below.

Free Software

The stage is set then, and DIY is ready to become in vogue again. But where do you start, and what do you need? If not GMail or Hotmail, who will host your e-mail, chat, and other services you’ve come to depend on? Well, it is tempting to cut the answer short, and say: “You”. However, that does not mean that every man, woman and child has to build their own stack. It makes sense to share, but within smaller groups and communities. For example, it is useful to have a family domain, which every family member can hinge their e-mail address off. A community could share the rent of a virtual machine, and use it for multiple domains, one for each individual group; think the local youth club, etc. The French Data Network (FDN) has a similar business model for their ISP service, where each customer is an owner of a local branch.

For the software to provide the services we need in our own stack, we find ourselves in the very fortunate situation that it is already all available for free. And it is not only gratis, it is free from the control of any authority or corporation, free to be distributed, modified, and developed. I’m of course talking about Free and Open Source Software (FOSS), which owes much of its core values, defined in the GPL, to Richard Stallman. (“There isn’t a lawyer on earth who would have drafted the GPL the way it is,” says Eben Moglen. [“Continuing the Fight“]). We may take it for granted now; however, we could very easily have ended up in a shareware world, where utilities of all kinds would still be available, but every function would come with a price tag, and only the original developers would have access to the source code and be able to make modifications. Many Windows users will probably recognize this world.

Assuming one of the popular GNU/Linux distributions, most of the software below should already be available in the main repositories. Thus it is a matter of a one-line command, or a few clicks, to install. Again, a major advantage of free software: not only is it gratis, it is usually refreshingly simple to install. The typical procedure for most proprietary software would include surfing around on an unknown web site for a download link, downloading a binary, and trusting (gambling, really) that it has not been tampered with. Next, an “Install Wizard” of dubious usefulness and quality gives you a spectacular progress bar, sometimes complete with ads.
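For example, on a Debian-based system, several of the services discussed below can be installed with a single command (package names as found in the Debian repositories):

sudo apt-get install apache2 postfix dovecot-imapd spamassassin postgrey squirrelmail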

The DIY Cloud

The following is a list of some of the most common and widely used free and open source solutions for typical Internet services, including e-mail, web sites and blogging, chat, voice and video calls, online calendars, file sharing and social networks. There are of course many other alternatives, and this is not meant to be an exhaustive list. It should be plenty to get good personal or community services started, though.

  • The Apache HTTP web server is the most widely used web server on the Internet, powering just shy of 60% of web sites (October 2012). It usually comes as a standard package in most distributions, and is easy to start up and configure. For the multi-host use-case, it is trivial to use the same server for multiple domains (see the sketch after this list).
  • If you are publishing through a blog like this one, the open source WordPress project is a natural companion to the Apache web server. It too is available through standard repositories; however, you might want to download the latest source and do a custom install, both for the security updates and to do custom tweaks.
  • For e-mail, Postfix is a typical choice, and offers easy setup, multi-user and multi-domain features, and integrates well with other must-have tools. That includes SpamAssassin (another Apache Foundation project) and Postgrey to handle unwanted mail, and Dovecot for IMAP and POP3 login. For a web frontend, SquirrelMail offers a no-frills, fully featured e-mail client. All of these are available through repository install.
  • Moving into slightly less used software, but still very common services, we find the XMPP (aka Jabber) servers ejabberd and Apache Vysper, with more to choose from. Here, a clear best-of-breed has yet to emerge, and furthermore, it will require a bit more effort on the admin and user side to configure and use. As an alternative, there is of course always IRC, with plenty of software in place.
  • Taking instant chat one step further, a Voice-over-IP server like Asterisk is worth considering. However, here setup and install might be tricky, and again, signing up / switching over users might require more effort. Once installed, though, there are plenty of FOSS clients to choose from, both on the desktop and mobile.
  • Moving on to more business oriented software, online calendars through the Apache caldav module are worth exploring. As an alternative, the Radicale server is reported to be easy to install and use.
  • A closely related standard protocol, WebDAV, offers file sharing and versioning (if plain old FTP is not an option). Again, there is an Apache module, mod_dav, which is relatively easy to set up, and access in various ways, including from OS X and Windows.
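As an illustration of the multi-domain use-case mentioned in the first item above, a minimal name-based virtual host setup might look like this sketch (domains and paths are placeholders):

NameVirtualHost *:80

<VirtualHost *:80>
  ServerName example.org
  DocumentRoot /var/www/example.org
</VirtualHost>

<VirtualHost *:80>
  ServerName club.example.net
  DocumentRoot /var/www/club.example.net
</VirtualHost>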
DIY Internet

That list should cover the basics, and a bit more. To round it off, there are a number of experimental or niche services which are worth considering as alternatives to their proprietary and closed counterparts. For search, the distributed YaCy project looks promising. GNU Social and Diaspora aim to take on the heavyweights in social networking. Finally, GNUnet offers peer-to-peer file sharing, and ownCloud a self-hosted alternative to file synchronisation services.

The future lies in distributed services, with content at the end-nodes, rather than the hubs. In other words, a random network, rather than a scale-free one. Taking that characteristic back to the physical layer (which traditionally always has been scale-free), there are “dark nets” or mesh nets, which aim to build an alternative physical infrastructure based on off-the-shelf WiFi equipment. Currently, this is at a very early experimental stage, but the trend is clear: Local, distributed and controlled by individuals rather than large corporations.

Analysing Apache Logs: gnuplot and awk


The Apache http logs

I wanted to make a graph of the amount of data served by my Apache server, with a bit finer granularity than AWStats could give. The access log has all the information I needed, including the time of each request and the bytes served. Assuming the standard combined format, the time stamp is the 4th field, and the bytes served the 10th.
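For reference, a combined-format log line looks something like this (a made-up example; fields are space-separated, so the 4th is the bracketed timestamp and the 10th is the response size in bytes):

127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "http://example.com/" "Mozilla/4.08"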

Thus, the following will isolate the necessary data for my graph. (Note, the log can usually be found at /var/log/httpd/access_log).

cat /tmp/access | cut -f 4,10 -d ' '

However, it turns out not all log entries store the bytes served. This includes file-not-found errors and certain requests which return no data. Some cases will have a hyphen, while others will simply be blank. To pick out only the lines which contain data, I appended the line above with:

cat /tmp/access | cut -f 4,10 -d ' ' | egrep ".* [0-9]+"

The first plot

This is enough to start working with in gnuplot. First we have to set the time format of the x-axis. The Apache log file uses this format: “[10/Oct/2000:13:55:36”, or in terms of strftime(3): “[%d/%b/%Y:%H:%M:%S”. (Note that the opening bracket from the log is included in the formatting string.)

To set the time format in gnuplot, and furthermore specify that we work with time on the x-axis:
set timefmt "[%d/%b/%Y:%H:%M:%S"
set xdata time

The data can then be plotted with the following command:
plot "< cat /tmp/access | cut -f 4,10 -d ' ' | egrep '.* [0-9]+'" using 1:2

To output to file, the following will do. The graph below shows the served files from my logs in the last couple of days.
set terminal png size 600,200
set output "/tmp/gnuplot_first.png"

[Figure: First plot]

Improvements
There are a few improvements to be made on the graph above. Most importantly, the data is slightly misleading, since files served at the same time are not accumulated. Furthermore, aesthetics like the legend, axis units, and title formatting are missing. Also note that the graph is scaled to a few outliers: I have a 7 MB video on my blog, which is downloaded occasionally. For the following examples, I will focus on the first day, where this file is not included.

First, I’ve made some minor improvements, and in the second graph I’ve applied the “frequency” smoothing function. Notice how the first graph has a maximum around 440 kb, while the smoothed and accumulated graph below peaks at around 900.
set terminal png size 600,250
set xtics rotate
set xrange [:"[24/Apr/2011:22"]

plot "< cat /tmp/access | cut -f 4,10 -d ' ' | egrep '.* [0-9]+'" using 1:($2/1000) title "kb" with points

plot "< cat /tmp/access | cut -f 4,10 -d ' ' | egrep '.* [0-9]+'" using 1:($2/1000) title "kb" smooth frequency with points

[Figure: Improved plot]

[Figure: Frequency smoothing]

awk
Although the frequency smoothing function gives an accurate picture, some of the accumulations are done over too wide a range, thus giving the impression of a higher load than is the case. Another way to sum up the data is to aggregate all requests within the same second into a sum. This can be done with the following awk script:

awk '{ date=$1; if (date==olddate) sum=sum+$2; else { if (olddate!="") {print olddate,sum}; olddate=date; sum=$2}} END {print date,sum}'
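The same script, expanded with comments for readability (functionally identical to the one-liner above):

awk '{
    date = $1
    if (date == olddate) {
        sum = sum + $2                          # same second: accumulate bytes
    } else {
        if (olddate != "") print olddate, sum   # emit the completed second
        olddate = date
        sum = $2                                # start a new sum
    }
}
END { print date, sum }'                        # emit the final second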

The input still has to be scrubbed, so the final line looks like this:
cat /tmp/access | cut -f 4,10 -d ' ' | egrep ".* [0-9]+$" | awk '{ date=$1; if (date==olddate) sum=sum+$2; else { if (olddate!="") {print olddate,sum}; olddate=date; sum=$2}} END {print date,sum}' > /tmp/access_awk

Plotting these two functions in the same graphs shows the difference between the peaks of the frequency function, and the simple aggregation:
plot "< cat /tmp/access | cut -f 4,10 -d ' ' | egrep '.* [0-9]+'" using 1:($2/1000) title "frequency" smooth frequency with points, "/tmp/access_awk" using 1:($2/1000) title "awk" with points lt 0

[Figure: awk and frequency smoothing]

Moving average in Gnuplot
For the daily graph, I think I’d prefer the one using the awk output, and perhaps using lines or “impulses” as style instead. However, it does not address the outliers. To smooth them out, we could try a moving average. This is not supported by any native function in gnuplot, so we have to roll our own. Thanks to Ethan A Merritt, there is an example of this.

Of course, this puts a lot less emphasis on peaks, and the outlier at 650 kb in the graphs above is now represented by a spike of less than 200. Furthermore, there is a problem with taking the moving average of time data of inconsistent frequency: the values will be the same whether the last five requests arrived over an hour or within a few seconds.

samples(x) = $0 > 4 ? 5 : ($0+1)
avg5(x) = (shift5(x), (back1+back2+back3+back4+back5)/samples($0))
shift5(x) = (back5 = back4, back4 = back3, back3 = back2, back2 = back1, back1 = x)
init(x) = (back1 = back2 = back3 = back4 = back5 = 0)

plot init(0) notitle, "/tmp/access_awk" using 1:(avg5($2/1000)) title "awk & avg5" with lines lt 1

[Figure: Moving average]

Zooming out to the day view, the average is maybe more appropriate, since the data overall has a more consistent frequency.

set xrange [*:*]
set format x "%d"

plot "/tmp/access_awk" using 1:(avg5($2/1000)) title "awk & avg5" with lines

[Figure: Full graph, moving average]

Cumulative
Finally, another interesting view is the cumulative output day by day. This can easily be achieved by inserting a blank line in the data file between each day. In awk, using the previous sum file generated above, it can be done like this:

cat /tmp/access_awk | awk 'BEGIN { FS = ":" } ; { date=$1; if (date==olddate) print $0; else { print ""; print $0; olddate=date}}' > /tmp/access_awk_days

Or an alternative, based on the original access_log file. The aggregation per second is then not necessary, since the “cumulative” function will do the same operation, and the graph will be exactly the same:
cat /tmp/access | cut -f 4,10 -d ' ' | egrep ".* [0-9]+$" | awk 'BEGIN { FS = ":" } ; { date=$1; if (date==olddate) print $0; else { print ""; print $0; olddate=date}}' > /tmp/access_awk_days

And the gnuplot commands. Note that the tics on the x-axis are set manually here, starting on a day before the first day in the plot, and ending on the last. The increment is set to a bit less than a day in seconds (60 * 60 * 24 = 86400) to approximately center it under each line. Also note that the format of the start and end arguments still has to be the same as set with timefmt in the beginning.
set xtics "[23/Apr/2011:0:0:0", 76400, "[29/Apr/2011:23:59:59"
set format x "%d"

plot "/tmp/access_awk_days" using 1:($2/1000000) title "cumulative (MB)" smooth cumulative

[Figure: Cumulative]
