Showing posts from 2007

pacparser - a library to parse PAC files

As I mentioned earlier, proxy auto-config (PAC) files are becoming increasingly important for web proxy usage because of the automation and ease of administration they provide. Almost all popular browsers today support them. But there is still a dearth of tools for processing PAC files; for example, popular web software like curl, wget and python-urllib still can't take a PAC file for proxy configuration. That was the problem I wanted to solve when I started to work on pacparser. Now it's ready in its full glory: http://code.google.com/p/pacparser . From the release announcement: I am very pleased to announce the release of "pacparser", a C library to parse proxy auto-config (PAC) scripts. Needless to say, PAC files are now a widely accepted method for proxy configuration management, and almost all popular browsers support them. The idea behind pacparser is to make it easy to add this PAC file parsing capability to other programs. It comes as a shared C library

ladakh, land of peace and quiet - part II

Time to continue the Ladakh story started in the last post. So we reached our guest house in Leh on Monday night at around 10 PM. Pankaj was bowled over by the beautiful smile of the receptionist and manager of the guest house, a simple country girl. Actually, she was cute :) She was the daughter of the guest house owner. The whole guest house was run by the family: gardening, managing, cooking, cleaning, everything. The people there were really nice. They cooked food just for us even though the regular dinner time was already over. We had a good sleep that night. Next morning, after a Ladakhi breakfast (Ladakhi bread, honey jam and butter), we went out to see Leh. The main market was about 20 minutes away from the guest house, and the whole route was lined with handicraft shops and scenic views on both sides. We had lunch in the market itself and came back. Then we slept through the afternoon. The two days of travel were finally catching up with us. In the evening, we

ladakh, land of peace and quiet - part I

Have you ever felt the power of space? When you feel that the space, just the space around you, affects you strongly. Almost all of us have experienced it for a short while in some way or other, for example when we go to a temple. I felt it for much longer. It happened to us when we visited Ladakh last month. By we, I mean Pankaj and me. For those who don't know, Pankaj and I are best buddies. So, we went to this land of peace and quiet. There are some obvious things that make Ladakh different from all other hill stations: an altitude so high that AMS (Acute Mountain Sickness) comes to you easily, a different kind of people, and proximity to both the Pakistan and China borders. But there are some things that are not easy to imagine, like how it can calm you beyond your imagination. We were very excited about this trip. We decided to go by the Manali-Leh road and come back by air. Our route was something like: Hyderabad -> Delhi -> Chandigarh -> Manali -> Leh ->

Hacking squid

In this post, I would like to talk about the recent fun I had with squid. It involved some troubleshooting and some hacking. Problem: squid would stop responding after running for some random period of time, say 10 to 40 minutes, and CPU usage would shoot up to 95-100%. I started with strace, but everything looked fine there. Then I tried ltrace, and there I got the first clue. squid was comparing two strings in an infinite loop:

    strcmp("thumbnail.videoegg.com", "i12.ebaystatic.com") = -1
    strcmp("thumbnail.videoegg.com", "i12.ebaystatic.com") = -1
    strcmp("thumbnail.videoegg.com", "i12.ebaystatic.com") = -1

Looks like some bad 'for' loop. But in what part of the code, and why? It needed a little more debugging to answer these questions. The squid binary that I was running was installed from a Debian package and thus was stripped of debugging symbols. To fix that problem, I rebuilt the squid package with debugging information. On

Real Tail'ing in Python

or, finding the last few lines in a file. OK, so the last solution was not perfect: it returned only the last line of a file. What about returning, say, 10 or maybe more lines? Here is the modified Tail function to do that:

    def Tail(filepath, nol=10, read_size=1024):
        """
        This function returns the last nol lines of a file.
        Args:
            filepath: path to file
            nol: number of lines to return (optional, default=10)
            read_size: data is read in chunks of this size (optional, default=1024)
        Raises:
            IOError if file cannot be processed.
        """
        f = open(filepath, 'rU')  # U is to open it with Universal newline support
        offset = read_size
        f.seek(0, 2)
        file_size = f.tell()
        while 1:
            if file_size < offset:
                offset = file_size
            f.seek(-1*offset, 2)
            read_str = f.read(offset)
            # Remove newline at the end
            if read_str[offset - 1] == '\n':
                read_str = read_str[:-1]
            lines = read_str.split('\n')
            if len(lines) >= nol:  # Got no
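The excerpt above is cut off, so what follows is not the original function but a minimal Python 3 sketch of the same idea: read a growing window from the end of the file until it holds enough newlines. The name `tail`, the binary-mode reads, and the assumption of `\n` line endings are choices made for this illustration.

```python
import os


def tail(filepath, nol=10, read_size=1024):
    """Return up to the last `nol` lines of a file as a list of bytes objects.

    Hypothetical helper for illustration; assumes lines end with b"\n".
    """
    if nol <= 0:
        return []
    with open(filepath, "rb") as f:
        f.seek(0, os.SEEK_END)
        file_size = f.tell()
        offset = 0
        lines = []
        while offset < file_size:
            # Grow the window by one chunk, but never past the start of the file.
            offset = min(offset + read_size, file_size)
            f.seek(file_size - offset)
            data = f.read(offset)
            # Drop one trailing newline so it does not produce an empty last entry.
            if data.endswith(b"\n"):
                data = data[:-1]
            lines = data.split(b"\n")
            # With more than `nol` pieces, the last `nol` are all complete lines;
            # only the first piece may have been cut mid-line.
            if len(lines) > nol:
                break
        return lines[-nol:]
```

The loop stops only when it has more than `nol` pieces (or has read the whole file), because the first piece in the window may be a partial line. Each pass re-reads the whole window, which keeps the code short at the cost of some repeated I/O on files with very long lines.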

Tail'ing in Python

or, finding the last line of a huge file. How do you find the last line of a 2 GB log file from within your program? You don't want to go through the whole file, right? Right. What you want to do is start reading from the end until you find a newline character. Here is how I did it in Python:

    def Tail(filepath, read_size=1024):
        """
        This function returns the last line of a file.
        Args:
            filepath: path to file
            read_size: data is read in chunks of this size (optional, default=1024)
        Raises:
            IOError if file cannot be processed.
        """
        f = open(filepath, 'rU')  # U is to open it with Universal newline support
        offset = read_size
        f.seek(0, 2)
        file_size = f.tell()
        while 1:
            if file_size < offset:
                offset = file_size
            f.seek(-1*offset, 2)
            read_str = f.read(offset)
            # Remove newline at the end
            if read_str[offset - 1] == '\n':
                read_str = read_str[0:-1]
            lines = read_str.split(
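Since the listing above is truncated, here is a separate, minimal Python 3 sketch of the approach the post describes: seek to the end of the file and read backwards in chunks until a newline turns up. The name `last_line`, the binary-mode reads, and the assumption of `\n` line endings are illustrative choices, not the original code.

```python
import os


def last_line(filepath, read_size=1024):
    """Return the last line of a file as bytes, without its trailing newline.

    Hypothetical helper for illustration; assumes lines end with b"\n".
    """
    with open(filepath, "rb") as f:
        f.seek(0, os.SEEK_END)
        file_size = f.tell()
        offset = 0
        data = b""
        while offset < file_size:
            # Widen the window from the end by one chunk at a time.
            offset = min(offset + read_size, file_size)
            f.seek(file_size - offset)
            data = f.read(offset)
            # Ignore a single newline that terminates the file.
            if data.endswith(b"\n"):
                data = data[:-1]
            newline_at = data.rfind(b"\n")
            if newline_at != -1:
                # Everything after the last newline is the last line.
                return data[newline_at + 1:]
        # No newline found: the whole file is a single line (or it is empty).
        return data
```

For a typical log file only the final chunk is read, so the cost depends on the length of the last line rather than on the size of the file.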

pactester - a tool to test proxy auto-config (PAC) files

Hackers and Sysadmins :-) Google has recently released "pactester", a tool to test proxy auto-configuration (PAC) files. Use of PAC files is becoming more and more common because of the automation and ease of administration they provide. Before pactester there was no "real" way to test PAC files. We could tell whether a site would be accessible using a given PAC file or not. But we could not tell which proxy server would be used for a specific URL unless we examined the traffic using a network sniffer or checked the access logs at the proxy server. Both of these ways were not very accessible and were time-consuming. Of course, another way to test would be manual inspection of the PAC files, but that is error-prone and quite impractical for large and complex PAC files. Pactester resolves all these issues by simulating the browser behavior. It evaluates the PAC file in a JavaScript context and returns the proxy server for a specific URL using the PAC file
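For context, a PAC file's FindProxyForURL function returns a string such as "PROXY proxy.example.com:8080; DIRECT": a semicolon-separated list of options that the client tries in order. The sketch below is not part of pactester; it is a hypothetical helper, assuming you have such a result string in hand, that splits it into its entries. The host name is made up for the example.

```python
def parse_pac_result(result):
    """Split a FindProxyForURL-style result string into (kind, address) tuples.

    Hypothetical helper for illustration; `address` is None for entries
    such as DIRECT that carry no host:port.
    """
    entries = []
    for part in result.split(";"):
        fields = part.split()
        if not fields:
            # Skip empty pieces, e.g. after a trailing semicolon.
            continue
        kind = fields[0].upper()
        address = fields[1] if len(fields) > 1 else None
        entries.append((kind, address))
    return entries


# Example with a made-up proxy host.
print(parse_pac_result("PROXY proxy.example.com:8080; DIRECT"))
```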