vivek's blog

Algo Puzzle - Product of Elements of an Array

I came across this interesting problem of coming up with an algorithm that would compute, for each element of an array, the product of all other elements in that array, in linear time without using the division operator (or implementing a division algorithm.)

A solution is to traverse left to right computing the growing product, and then traverse right to left computing the growing product and multiplying it with the result from the previous traversal. I.e.,

Given,
P     Q    R    S
Compute,
1     P    PQ   PQR
x     x    x    x 
QRS   RS   S    1
-------------------
QRS   PRS  PQS  PQR  

Implemented in python,

def elProd( A ):
        result = []
        prod = 1

        # for i from 0 up to len(A) - 1
        for i in range( len( A ) ):
                result.append( prod )
                prod = prod * A[ i ]
        prod = 1
        
        # for i from len(A) - 1 down to 0
        for i in range( len( A ) - 1, -1, -1 ):
                result[ i ] = result[ i ] * prod
                prod = prod * A[ i ]

        return result

Inventor: W. Daniel Hillis

I have always dreamed of becoming an Inventor, have a lab of my own with all sorts of esoteric gadgets and all the time in the world to create whatever I want. At 46, Daniel Hillis is doing exactly that. An American inventor, he is living his dream - conjuring up whacky ideas and implementing them in his uber-cool lab. He is also an author, a scientist, and an engineer.

The Two Schools of Cryptography

Technology Research News (TRN) is featuring an article on the two schools of cryptography, based on the two basic methods of cryptography, namely, computational and probabilistic.

Two schools of cryptography (TRN How it works)

Threads Cannot Be Implemented as a Library

A paper by Hans Boehm demonstrates why vicarious multithreading, through library implementations (such as Pthreads), in languages like C and C++ can never be 100% correct.

UNIX Swapping

I was reading though a threaded discussion over at Kerneltrap.org, under the post "Swap Pre-Fetching" and found this really neat explanation of how swapping relates to files and inodes. Especially enlightening as to how a file backed (mmap()) swapping system works if the underlying executable itself is upgraded when the executable is running.

Allow me to elaborate. UNIX filesystems have a concept of "inodes" that store the body of the file, its permissions and its ownership. The inodes get linked into directories via names--aka. directory entries. The same inode can be linked into the filesystem in multiple places. (Hence the concept of a "hard link.") The filesystem keeps track of how many links an inode has, and the kernel keeps track of how many processes have opened a given inode. This concept is important, and I will come back to it.

When an executable runs, the executable's file as well as the files for all the libraries it depends on get opened. The pages for these files get mmap()'d into the process' address space as file-backed virtual memory. The memory gets marked copy-on-write, so that any changes to the mmap()'d code result in a fault, and break the file backing. In any case, the file-backed portions are backed by the contents of the inodes themselves.

Under virtual memory pressure, the kernel will have to deallocate physical pages of memory from some processes in order to allocate them to others. There are two strategies available here: Write dirty pages to swap, and discard clean pages. Clean pages are pages which have either an explicit file backing (such as program executable pages), and pages that were previously swapped, brought back in, but still have an equivalent copy in the swap partition. (This is sometimes refered to as the "swap cache," though I don't know if that designation is accurate.)

So yes, under memory pressure, some pages of an executable might get discarded and will need to be brought in later from the original executable. The grandparent wonders how that works if a user upgrades a binary while the executable runs.

Recall that there's the separation between the file's contents (the inode) and the name given to it in the file system (hard link to the inode). File descriptors are bound to inodes, not directory entries. When you "rm" a file, you remove the link between the directory and the inode. When you replace a file, say with "cp," the existing inode gets unlinked and a new inode gets linked in its place. When you "mv" a file, it gets linked in its new location, and unlinked from its old location.

The filesystem code does not reclaim the space allocated to the inode until all references to the inode drop. This includes all filesystem links and open file descriptors. Thus, when you replace a program's executable while it executes, the currently running program continues to see the old executable, even if the inode doesn't have a visible link in the filesystem. The inode will remain allocated until all of its open file descriptors get closed. Then and only then will the filesystem reclaim the storage associated with the inode.

In fact, it is this property of UNIX derived filesystems that leads to all the orphaned inodes you find in "lost+found/" after a fsck if your system gets shut down abruptly. Any inodes that were open at the point of the crash, but which did not have a hard directory link end up here.

Introduction to the Xen Virtual Machine

Xen is the new kid in the virtualization arena, receiving well deserved attention from the industry and the academia, with some really big players betting on it.

Xen is an open source virtual machine monitor, or hypervisor, developed by the University of Cambridge. It has a design goal of being able to run 100 full featured OS instances on a single typical computer. Xen provides secure isolation, resource control, quality of service guarantees, and live migration of virtual machines.

Linux Journal is featuring an introductory article on Xen and its design. A nice, moderately technical read: Introduction to the Xen Virtual Machine

Swarm Intelligence - Presentation Slides

Nandagopal presented a small seminar on Swarm Intelligence at my college a few days back. A technique in Artificial Intelligence (his research interest), Swarm Intelligence is based around the study of collective behaviour in decentralised, self-organised, systems. I have attached the presentation slides to this post. Enjoy!

Mobile Mesh Networking

A Wireless LAN (WLAN) is a Wireless Local Area Network that uses radio waves as its carrier. The backbone network is usually wired and provides one or more wireless access points connecting the wireless devices to the wired network. So, communication distance is often constrained by the availability and location of access points. A Wireless Mesh Network on the other hand relies on all the nodes in the network to facilitate communication, thus extending the transmission distance upto the farthest node, where each node has atleast one reachable neighboring node that is closer to the base station. Since each node acts like a repeater in mesh networks, the more the number of nodes, the more the bandwith and the stronger the signal that reaches the access point.

If you are looking for a definition of a Mesh Network -

A mesh network is a network that employs one of two connection arrangements, full mesh topology or partial mesh topology. In the full mesh topology, each node is connected directly to each of the others. In the partial mesh topology, nodes are connected to only some, not all, of the other nodes.

Last week, PacketHop released its TrueMesh mobile mesh networking software and a suite of multimedia applications that provide instant wireless group communication. According to InformationWeek -

The concept could be especially useful for law-enforcements agencies that need to set up a network around an incident scene. They can use the suite of multimedia applications to instant message each other, send photos of suspects, whiteboard on the photos, and stream video if they have cameras connected to their mobile devices. Additionally, they can locate and track different law-enforcement units that are part of the network on electronic maps.
An Instant and Mobile Wireless Mesh Network (Roland Piquepaille's Technology Trends)

WinFS is back?

Skeptics called the next-generation Windows File System (WinFS), nothing more than vaporware after a capricious Redmond decided to rip it from Longhorn (now, Windows Vista). But it seems the company has no intentions of giving up, reports Microsoft-Watch.com. In fact, there are plans to backport the file system into Windows XP.

The WinFS Team Blog, however, doesn't look very promising.

Become.com's Web Crawler

[img_assist|fid=64|thumb=1|alt=Web]

Sun Developer Network is featuring a story on Become.com's Java technology Web crawler that maybe -

[T]he most sophisticated, massively scaled Java technology application in existence, obtaining information on over 3 billion web pages and writing well over 8 terabytes of data (and growing) on 30 fully distributed servers in seven days.

The company has patented its Affinity Index Ranking (AIR) algorithm which -

[P]rovides highly targeted search results by understanding the context of pages on the web. AIR integrates advanced concepts from applied physics and engineering dynamics that Become.com will eventually make public.
Become.com's Web Crawler: A Massively Scaled Java Technology Application