Tuesday 8 January 2013

Synchronized print in Python

Today I faced a problem where I had to print to the standard output in a synchronized way. I put the following code in a file named rpython.py:
from __future__ import print_function
from threading import Lock

_global_lock = Lock()
_old_print = print 

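def print(*args, **kwargs):
    # Hold the lock for the whole call so that output from
    # different threads cannot interleave.
    with _global_lock:
        _old_print(*args, **kwargs)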
Now, in every Python module that requires synchronized printing I typed:
from __future__ import print_function
from rpython import print
Alternatively, if you don't want to shadow the built-in print, you can import the synchronized version under a different name:
from __future__ import print_function
from rpython import print as rprint
Have a nice, synchronized day.
EDIT: I should have started with a clear statement that Python's print is not synchronized in any way. Multiple threads writing concurrently to the same stream can interfere with each other, and usually they do, so you end up with a random mixture of messages that looks really ugly.
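To see the wrapper at work, here is a minimal sketch that lets several threads print into one shared in-memory buffer and then checks that every line arrived intact. It is written for Python 3, where print is already a function. The names locked_print and run_demo are made up for this example and are not part of rpython.py.

```python
import io
import threading

_lock = threading.Lock()
_builtin_print = print


def locked_print(*args, **kwargs):
    """Same idea as the wrapper in rpython.py, under a hypothetical name."""
    with _lock:
        _builtin_print(*args, **kwargs)


def run_demo(n_threads=8, n_lines=200):
    """Let several threads print into one shared buffer and return its lines."""
    buf = io.StringIO()

    def worker(idx):
        for i in range(n_lines):
            # Each call emits four space-separated fields plus a newline.
            locked_print("thread", idx, "line", i, file=buf)

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buf.getvalue().splitlines()
```

Calling run_demo() returns the collected lines. With the lock held around every call, each line should contain exactly the four fields written by a single thread, although the order in which the threads take turns is still arbitrary.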

Monday 7 January 2013

util.NativeCodeLoader: Unable to load native-hadoop library for your platform.

Problem: when launching a Hadoop-based app you may encounter the message
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Solution: type the following command in the command line. The path may vary depending on your platform and Hadoop version.
export LD_LIBRARY_PATH=/usr/lib/hadoop-0.20-mapreduce/lib/native/Linux-amd64-64
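If the app is started from a Python script rather than from an interactive shell, the same setting can be passed to the child process. The sketch below is only an illustration: the helper name env_with_native_dir is made up, and unlike the export above it prepends the directory to any existing LD_LIBRARY_PATH instead of overwriting it.

```python
import os

# Path taken from the export command above; it may differ on your platform.
NATIVE_DIR = "/usr/lib/hadoop-0.20-mapreduce/lib/native/Linux-amd64-64"


def env_with_native_dir(native_dir=NATIVE_DIR, base_env=None):
    """Return a copy of the environment with native_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries so we never produce a stray separator.
    parts = [p for p in current.split(os.pathsep) if p]
    if native_dir not in parts:
        parts.insert(0, native_dir)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return env
```

The returned dictionary can be handed to subprocess.call through its env argument when launching the Hadoop command of your choice.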

How to build jar with Maven containing nothing but dependencies

Today I came across a problem where I had to split my project into two jars: one containing my project's classes and the other containing the dependencies. The former changes every time I modify the source files. The latter stays basically constant throughout development. The split makes deploying my application much easier. Previously, after each improvement I had to transfer a big-fat-and-ugly *-jar-with-dependencies.jar, and that took a good ten minutes because the file weighs 189MB. It turned out that the dependencies weigh... 189MB and my compiled classes 45KB. Below is the XML that I placed in src/main/resources/assemblies/only-deps.xml:

  
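<assembly>
  <id>only-deps</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <useProjectArtifact>false</useProjectArtifact>
      <unpack>true</unpack>
      <scope>runtime</scope>
    </dependencySet>
  </dependencySets>
</assembly>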

and a profile that I added to the main pom.xml:
        
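        <profile>
            <id>only-deps</id>
            <build>
                <plugins>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-assembly-plugin</artifactId>
                        <configuration>
                            <descriptors>
                                <descriptor>src/main/resources/assemblies/only-deps.xml</descriptor>
                            </descriptors>
                        </configuration>
                        <executions>
                            <execution>
                                <phase>package</phase>
                                <id>create-my-bundle</id>
                                <goals>
                                    <goal>single</goal>
                                </goals>
                            </execution>
                        </executions>
                    </plugin>
                </plugins>
            </build>
        </profile>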
Run the whole thing with mvn install -P only-deps.