File: //proc/self/root/usr/lib/python2.7/site-packages/urlgrabber/mirror.pyc
�
y]Rc           @   s�   d  Z  d d l Z d d l Z d d l Z d d l m Z m Z m Z m Z d d l m	 Z	 m
 Z
 d d l m Z d d l m Z d �  Z
 d f  d	 �  �  YZ d
 f  d �  �  YZ d e f d
 �  �  YZ d e f d �  �  YZ e d k r� n  d S(   s(  Module for downloading files from a pool of mirrors
DESCRIPTION
  This module provides support for downloading files from a pool of
  mirrors with configurable failover policies.  To a large extent, the
  failover policy is chosen by using different classes derived from
  the main class, MirrorGroup.
  Instances of MirrorGroup (and cousins) act very much like URLGrabber
  instances in that they have urlread, urlgrab, and urlopen methods.
  They can therefore be used in very similar ways.
    from urlgrabber.grabber import URLGrabber
    from urlgrabber.mirror import MirrorGroup
    gr = URLGrabber()
    mg = MirrorGroup(gr, ['http://foo.com/some/directory/',
                          'http://bar.org/maybe/somewhere/else/',
                          'ftp://baz.net/some/other/place/entirely/'])
    mg.urlgrab('relative/path.zip')
  The assumption is that all mirrors are identical AFTER the base urls
  specified, so that any mirror can be used to fetch any file.
FAILOVER
  The failover mechanism is designed to be customized by subclassing
  from MirrorGroup to change the details of the behavior.  In general,
  the classes maintain a master mirror list and a "current mirror"
  index.  When a download is initiated, a copy of this list and index
  is created for that download only.  The specific failover policy
  depends on the class used, and so is documented in the class
  documentation.  Note that ANY behavior of the class can be
  overridden, so any failover policy at all is possible (although
  you may need to change the interface in extreme cases).
CUSTOMIZATION
  Most customization of a MirrorGroup object is done at instantiation
  time (or via subclassing).  There are four major types of
  customization:
    1) Pass in a custom urlgrabber - The passed-in urlgrabber is used
       for the grabs by default (see #2), so its options apply to
       the URL fetching.
    2) Custom mirror list - A mirror list can simply be a list of
       mirror strings (as shown in the example above), but each can
       also be a dict, allowing for more options.  For example, the
       first mirror in the list above could also have been:
         {'mirror': 'http://foo.com/some/directory/',
          'grabber': <a custom grabber to be used for this mirror>,
          'kwargs': { <a dict of arguments passed to the grabber> }}
       All mirrors are converted to this format internally.  If
       'grabber' is omitted, the default grabber will be used.  If
       'kwargs' is omitted, no extra arguments are passed.
       The kwarg 'max_connections' limits the number of concurrent
       connections to this mirror.  When omitted or set to zero,
       the default limit (2) will be used.
    3) Pass keyword arguments when instantiating the mirror group.
       See, for example, the failure_callback argument.
    4) Finally, any kwargs passed in for the specific file (to the
       urlgrab method, for example) will be folded in.  The options
       passed into the grabber's urlXXX methods will override any
       options specified in a custom mirror dict.
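The conversion described in #2 and the override rule in #4 can be sketched as follows. This is only an illustration: normalize_mirrors and effective_kwargs are hypothetical helper names, not functions of this module, and the module's own parsing may differ in detail.

```python
def normalize_mirrors(mirrors):
    """Return a new list in which every mirror is a dict with a 'mirror' key."""
    normalized = []
    for entry in mirrors:
        if isinstance(entry, dict):
            entry = dict(entry)          # copy so the caller's dict is untouched
        else:
            entry = {'mirror': entry}    # a bare string becomes the dict form
        normalized.append(entry)
    return normalized


def effective_kwargs(mirror, call_kwargs):
    """Per-call options override options stored in the mirror dict."""
    merged = dict(mirror.get('kwargs', {}))
    merged.update(call_kwargs)
    return merged


mirrors = normalize_mirrors([
    'http://foo.com/some/directory/',
    {'mirror': 'http://bar.org/maybe/somewhere/else/',
     'kwargs': {'max_connections': 4, 'timeout': 30}},
])
options = effective_kwargs(mirrors[1], {'timeout': 5})
```

Here options keeps max_connections from the mirror dict, while the per-call timeout replaces the mirror's value.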
i����N(   t   URLGrabErrort   CallbackObjectt   DEBUGt   _to_utf8(   t
   _run_callbackt	   _do_raise(   t
   exception2msg(   t   _THc         C   s   |  S(   N(    (   t   st(    (    s5   /usr/lib/python2.7/site-packages/urlgrabber/mirror.pyt   _g   s    t   GrabRequestc           B   s   e  Z d  Z RS(   s  This is a dummy class used to hold information about the specific
    request.  For example, a single file.  By maintaining this information
    separately, we can accomplish two things:
      1) make it a little easier to be threadsafe
      2) have request-specific parameters
    (   t   __name__t
   __module__t   __doc__(    (    (    s5   /usr/lib/python2.7/site-packages/urlgrabber/mirror.pyR
   j   s   t   MirrorGroupc           B   s�   e  Z d  Z d �  Z d d g Z d �  Z d �  Z d �  Z d �  Z d �  Z	 i  d	 � Z
 d
 �  Z d �  Z d d � Z d
 �  Z d d � Z RS(   s|  Base Mirror class
    Instances of this class are built with a grabber object and a list
    of mirrors.  Then all calls to urlXXX should be passed relative urls.
    The requested file will be searched for on the first mirror.  If the
    grabber raises an exception (possibly after some retries) then that
    mirror will be removed from the list, and the next will be attempted.
    If all mirrors are exhausted, then an exception will be raised.
    MirrorGroup has the following failover policy:
      * downloads begin with the first mirror
      * by default (see default_action below) a failure (after retries)
        causes it to increment the local AND master indices.  Also,
        the current mirror is removed from the local list (but NOT the
        master list - the mirror can potentially be used for other
        files)
      * if the local list is ever exhausted, a URLGrabError will be
        raised (errno=256, No more mirrors).  The 'errors' attribute
        holds a list of (full_url, errmsg) tuples.  This contains
        all URLs tried and the corresponding error messages.
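The policy above can be sketched as a simplified loop. This is a minimal illustration under stated assumptions: fetch_with_failover, NoMoreMirrors and fake_fetch are hypothetical names, IOError stands in for URLGrabError, and the master index and retries are left out.

```python
class NoMoreMirrors(Exception):
    """Stand-in for the 'No more mirrors' error; carries the errors list."""
    def __init__(self, errors):
        Exception.__init__(self, 'No more mirrors to try.')
        self.errors = errors


def fetch_with_failover(mirrors, relative_url, fetch):
    local = list(mirrors)        # per-download copy; the master list is untouched
    errors = []
    while local:
        full_url = local[0] + relative_url
        try:
            return fetch(full_url)
        except IOError as exc:
            errors.append((full_url, str(exc)))
            local.pop(0)         # remove the failed mirror from the local list only
    raise NoMoreMirrors(errors)


def fake_fetch(url):
    # Hypothetical fetcher: the first mirror always fails.
    if url.startswith('http://foo.com/'):
        raise IOError('connection refused')
    return 'contents of ' + url


master = ['http://foo.com/some/directory/',
          'http://bar.org/maybe/somewhere/else/']
result = fetch_with_failover(master, 'relative/path.zip', fake_fetch)
```

After the call, master still holds both mirrors, since only the per-download copy was shortened.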
    OPTIONS
      In addition to the required arguments "grabber" and "mirrors",
      MirrorGroup also takes the following optional arguments:
      
      default_action
        A dict that describes the actions to be taken upon failure
        (after retries).  default_action can contain any of the
        following keys (shown here with their default values):
          default_action = {'increment': 1,
                            'increment_master': 1,
                            'remove': 1,
                            'remove_master': 0,
                            'fail': 0}
        In this context, 'increment' means "use the next mirror" and
        'remove' means "never use this mirror again".  The two
        'master' values refer to the instance-level mirror list (used
        for all files), whereas the non-master values refer to the
        current download only.
        The 'fail' option causes immediate failure: the exception is
        re-raised and no further attempts are made to get the current
        download.  As in the "No more mirrors" case, the 'errors'
        attribute is set in the exception object.
        This dict can be set at instantiation time,
          mg = MirrorGroup(grabber, mirrors, default_action={'fail':1})
        at method-execution time (only applies to current fetch),
          filename = mg.urlgrab(url, default_action={'increment': 0})
        or by returning an action dict from the failure_callback
          return {'fail':0}
        in increasing precedence.
        
        If all three of these were done, the net result would be:
              {'increment': 0,         # set in method
               'increment_master': 1,  # class default
               'remove': 1,            # class default
               'remove_master': 0,     # class default
               'fail': 0}              # set at instantiation, reset
                                       # from callback
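The precedence described above amounts to a key-by-key merge, lowest precedence first. The sketch below assumes that reading; merge_actions and CLASS_DEFAULTS are hypothetical names used only for illustration.

```python
CLASS_DEFAULTS = {'increment': 1,
                  'increment_master': 1,
                  'remove': 1,
                  'remove_master': 0,
                  'fail': 0}


def merge_actions(*layers):
    """Merge action dicts; later layers override earlier ones key by key."""
    merged = dict(CLASS_DEFAULTS)
    for layer in layers:
        if layer:                # a callback may return None or an empty dict
            merged.update(layer)
    return merged


net = merge_actions({'fail': 1},        # set at instantiation
                    {'increment': 0},   # set at method-execution time
                    {'fail': 0})        # returned by the failure callback
```

The resulting net dict matches the net result listed above.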
      failure_callback
        This callback is called when a mirror "fails",
        meaning the grabber raises some URLGrabError.  If this is a
        tuple, it is interpreted to be of the form (cb, args, kwargs)
        where cb is the actual callable object (function, method,
        etc).  Otherwise, it is assumed to be the callable object
        itself.  The callback will be passed a grabber.CallbackObject
        instance along with args and kwargs (if present).  The following
        attributes are defined within the instance:
           obj.exception    = < exception that was raised >
           obj.mirror       = < the mirror that was tried >
           obj.tries        = < the number of mirror tries so far >
           obj.relative_url = < url relative to the mirror >
           obj.url          = < full url that failed >
                              # .url is just the combination of .mirror
                              # and .relative_url
        The failure callback can return an action dict, as described
        above.
        Like default_action, the failure_callback can be set at
        instantiation time or when the urlXXX method is called.  In
        the latter case, it applies only for that fetch.
        The callback can re-raise the exception quite easily.  For
        example, this is a perfectly adequate callback function:
          def callback(obj): raise obj.exception
        WARNING: do not save the exception object (or the
        CallbackObject instance).  As they contain stack frame
        references, they can lead to circular references.
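The two accepted forms of failure_callback can be illustrated as follows. This is a sketch only: split_callback, FakeCallbackObject and on_failure are hypothetical names, and FakeCallbackObject merely imitates the attributes listed above rather than using grabber.CallbackObject.

```python
def split_callback(callback):
    """Accept either a bare callable or a (cb, args, kwargs) tuple."""
    if isinstance(callback, tuple):
        cb, args, kwargs = callback
    else:
        cb, args, kwargs = callback, (), {}
    return cb, args, kwargs


class FakeCallbackObject(object):
    """Plain attribute holder imitating the object passed to the callback."""
    def __init__(self, **attrs):
        self.__dict__.update(attrs)


def on_failure(obj, log, prefix=''):
    log.append(prefix + obj.url)   # record only the URL, not the object itself
    return {'fail': 0}             # action dict: keep trying other mirrors


log = []
cb, args, kwargs = split_callback((on_failure, (log,), {'prefix': 'failed: '}))
obj = FakeCallbackObject(
    mirror='http://foo.com/some/directory/',
    relative_url='relative/path.zip',
    url='http://foo.com/some/directory/relative/path.zip',
    tries=1)
action = cb(obj, *args, **kwargs)
```

The callback stores a string derived from obj rather than obj itself, in line with the warning above about keeping references to the exception or the CallbackObject instance.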
    Notes:
      * The behavior can be customized by deriving and overriding the
        'CONFIGURATION METHODS'
      * The 'grabber' instance is kept as a reference, not copied.
        Therefore, the grabber instance can be modified externally
        and changes will take effect immediately.
    c         K   so   | |  _  |  j | � |  _ d |  _ t j �  |  _ d |  _ |  j	 | � d �  } |  j j
 d | d t � d S(   s�  Initialize the MirrorGroup object.
        REQUIRED ARGUMENTS
          grabber  - URLGrabber instance
          mirrors  - a list of mirrors
        OPTIONAL ARGUMENTS
          failure_callback  - callback to be used when a mirror fails
          default_action    - dict of failure actions
        See the module-level and class level documentation for more
        details.
        i    c         S   sH   t  j |  d � \ } } | o; |  j d i  � j d t � } | | f S(   Nt   mirrort   kwargst   private(   R   t   estimatet   gett   False(   t   mt   speedt   failR   (    (    s5   /usr/lib/python2.7/site-packages/urlgrabber/mirror.pyR     s    %t   keyt   reverseN(   t   grabbert   _parse_mirrorst   mirrorst   _nextt   threadt
   allocate_lockt   _lockt   Nonet   default_actiont   _process_kwargst   sortt   True(   t   selfR   R   R   R   (    (    s5   /usr/lib/python2.7/site-packages/urlgrabber/mirror.pyt   __init__�   s    			
	R"