Tuesday, 4 August 2015

Assistance with expanding a Sublime Text Plugin

I have a Sublime Text plugin that watches for the creation of files beginning with lp_.

When a file with the lp_ prefix is created, the plugin creates a folder of the same name with an images folder inside it.

I would like to watch different areas of my site and create the relevant folder within the nearest lp folder to the created file.

For example, I have the following folder structure:

Root > desktop > lp

Root > Mobile > lp

Root > Tablet > lp

When a file with the lp_ prefix is created in any of these 'device' folders, I would like the folder to be created within the nearest lp folder.

The plugin below is along the right lines, but I am unsure how to set rules for targeting specific folders.

import sublime, sublime_plugin, os

# We extend event listener
class ExampleCommand(sublime_plugin.EventListener):
    # This method is called every time a file is saved (not only the first time it is saved)
    def on_post_save_async(self, view):
        variables = view.window().extract_variables()
        fileBaseName = variables['file_base_name'] # File name without extension
        path = '/Users/jameshusband/Dropbox/development/remote/http://ift.tt/1P3thlB' + fileBaseName
        imagepath = path + '/images/'

        if fileBaseName.startswith('lp_') and not os.path.exists(path):
            os.mkdir(path)
            os.mkdir(imagepath)

Could anyone point me in the right direction for this? I am not very experienced with Python, so I am unsure of the best way to achieve my goal.
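(Not part of the original question, but one possible direction; the folder names here are assumptions based on the structure described above.) You could walk up from the saved file's own directory until you find a directory that contains an lp subfolder, and create the new folders there:

```python
import os

def nearest_lp_dir(start_path):
    """Walk up from start_path's directory until a folder containing 'lp' is found."""
    current = os.path.dirname(start_path)
    while True:
        candidate = os.path.join(current, 'lp')
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without finding one
            return None
        current = parent

# Inside on_post_save_async, something along these lines (hypothetical):
# lp_dir = nearest_lp_dir(view.file_name())
# if lp_dir and fileBaseName.startswith('lp_'):
#     os.makedirs(os.path.join(lp_dir, fileBaseName, 'images'))
```

Using `view.file_name()` rather than `extract_variables()` gives the full path of the saved file, which is what the upward search needs.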

Parsing XML with namespaces into a dataframe

I have the following simplified XML:

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:xsd="http://ift.tt/tphNwY" xmlns:xsi="http://ift.tt/ra1lAU"
 xmlns:soap="http://ift.tt/18hkEkn"> 
    <soap:Body>
        <ReadResponse xmlns="ABCDEFG.com">
            <ReadResult>
                <Value>
                    <Alias>x1</Alias>
                    <Timestamp>2013-11-11T00:00:00</Timestamp>
                    <Val>113</Val>
                    <Duration>5000</Duration>
                    <Quality>128</Quality>
                </Value>
                <Value>
                    <Alias>x1</Alias>
                    <Timestamp>2014-11-11T00:02:00</Timestamp>
                    <Val>110</Val>
                    <Duration>5000</Duration>
                    <Quality>128</Quality>
                </Value>
                <Value>
                    <Alias>x2</Alias>
                    <Timestamp>2013-11-11T00:00:00</Timestamp>
                    <Val>101</Val>
                    <Duration>5000</Duration>
                    <Quality>128</Quality>
                </Value>
                <Value>
                    <Alias>x2</Alias>
                    <Timestamp>2014-11-11T00:02:00</Timestamp>
                    <Val>122</Val>
                    <Duration>5000</Duration>
                    <Quality>128</Quality>
                </Value>
            </ReadResult>
        </ReadResponse>
    </soap:Body>
</soap:Envelope>

and would like to parse it into a dataframe with the following structure (keeping some of the tags and discarding the rest):

Timestamp                x1    x2
2013-11-11T00:00:00      113  101
2014-11-11T00:02:00      110  122

The problem is that the XML file includes namespaces, and I don't know how to proceed. I have gone through several tutorials (e.g., http://ift.tt/1rjd5Ef) and questions (e.g., How to open this XML file to create dataframe in Python? and Parsing XML with namespace in Python via 'ElementTree'), but none of them have helped or worked. I would appreciate it if anyone could help me sort this out.
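(A sketch, not from the original post; the SOAP envelope is abbreviated, and the soap namespace URI is the standard one rather than the shortened links above.) With ElementTree you can pass a prefix-to-URI mapping to find(), using any prefix you like for the default namespace ABCDEFG.com, then pivot the collected rows with pandas:

```python
import xml.etree.ElementTree as ET
import pandas as pd

xml = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReadResponse xmlns="ABCDEFG.com">
      <ReadResult>
        <Value><Alias>x1</Alias><Timestamp>2013-11-11T00:00:00</Timestamp><Val>113</Val></Value>
        <Value><Alias>x2</Alias><Timestamp>2013-11-11T00:00:00</Timestamp><Val>101</Val></Value>
      </ReadResult>
    </ReadResponse>
  </soap:Body>
</soap:Envelope>"""

ns = {'r': 'ABCDEFG.com'}  # map an arbitrary prefix to the default namespace
root = ET.fromstring(xml)
rows = [{'Timestamp': v.find('r:Timestamp', ns).text,
         'Alias': v.find('r:Alias', ns).text,
         'Val': int(v.find('r:Val', ns).text)}
        for v in root.iter('{ABCDEFG.com}Value')]
df = pd.DataFrame(rows).pivot(index='Timestamp', columns='Alias', values='Val')
```

The pivot turns the Alias values into columns, giving the Timestamp/x1/x2 layout shown above; the remaining tags (Duration, Quality) are simply never extracted.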

Python EOFError after using os.execl to restart the script

I have a script that does the following;

  • Checks if a temp.txt file is present in the working directory.
    • If present, the temp.txt and update.pyd files are removed.
  • Checks whether an update.pyd module is present in the working directory.
    • If present, it is imported and run.
    • The update process creates a temp.txt file in the working directory.
    • After the update the script is restarted using os.execl(sys.executable, sys.executable, *sys.argv)

But I keep getting an error when os.execl(sys.executable, sys.executable, *sys.argv) is called:

Traceback (most recent call last):
  File "<string>", line 73, in execInThread
  File "<string>", line 44, in __call__
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\netref.py", line 196, in __call__
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\netref.py", line 71, in syncreq
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 431, in sync_request
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 379, in serve
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 337, in _recv
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\channel.py", line 50, in recv
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\stream.py", line 166, in read
EOFError: [WinError 10054] An existing connection was forcibly closed by the remote host

I have another process in the same script that does something similar and restarts using os.execl(sys.executable, sys.executable, *sys.argv), but it restarts cleanly.

Could someone tell me why this is happening? What "connection" is the error referring to, and how do I close it before restarting? Thanks.

Plot a distance map by applying the distance transform to a list of x and y values in Python

I want to plot a distance map by applying the distance transform to a list of x and y values in Python.

I have started to plot the data from a CSV file:

"""PLOTTING DATA FROM A CSV FILE"""  # csv: comma-separated values file
from matplotlib import pyplot as plt
from matplotlib import style
from pandas import DataFrame

df = DataFrame.from_csv('d:/Users/Raquel/Desktop/test.csv', header=0, sep=';')  # unpack values into x and y using pandas to load my csv file

style.use('ggplot')

plt.scatter(df['x'].values,df['y'].values)
plt.title ('Data plot of a Tile')
plt.xlabel('xobj')
plt.ylabel('yobj')

plt.show()

Then I took the coordinates x and y from the file and I applied the distance transform:

"""DISTANCE MAP"""
# distance map: the morphological distance transform gives the closest distance of each pixel to its nearest boundary pixel
from scipy.ndimage.morphology import distance_transform_edt

# Coordinates
x = df['x'].values
y = df['y'].values

# Distance Transform
dis = distance_transform_edt(x, y) #distance transform applied to x,y
plt.subplot(2,2,1),plt.imshow(dis, origin='lower')
plt.title('Distance Map')

from scipy.ndimage.filters import gaussian_filter #convolution with the gaussian
gf = gaussian_filter(dis, sigma= 2.5)#gaussian filter applied to dis
plt.subplot(2,2,2),plt.imshow(gf, origin='lower')
plt.title('Distance Map with GF')

After that I applied the contour to the Distance Map with the gaussian filter:

"""CONTOUR"""
import matplotlib.pyplot as plt
import numpy as np

plt.subplot(2,2,3),plt.contour(gf)
plt.title('Contour')

plt.show()

Finally, I plot the points behind the contour image:

"""CONTOUR & HOT SPOTS"""

from matplotlib.pyplot import (colorbar)
plt.subplot(2,2,4), plt.contour(gf)  # plotting the hot spots behind the image with a transparency = alpha
plt.imshow(dis, origin='lower', alpha=0.19)
plt.title ('Contour & Hot Spots')

plt.show()

When I run my script I have this message and I don't know how to make my script work...

Traceback (most recent call last):
  File "D:/Users/Raquel/PycharmProjects/BASES/Density Map/v.density map of a tile1.py", line 20, in <module>
    dis = distance_transform_edt(x, y) #distance transform applied to x,y
  File "D:\Users\Raquel\Anaconda\lib\site-packages\scipy\ndimage\morphology.py", line 2111, in distance_transform_edt
    sampling = _ni_support._normalize_sequence(sampling, input.ndim)
  File "D:\Users\Raquel\Anaconda\lib\site-packages\scipy\ndimage\_ni_support.py", line 66, in _normalize_sequence
    raise RuntimeError(err)
RuntimeError: sequence argument must have length equal to input rank
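(Not part of the original post, but a sketch of what the error hints at.) distance_transform_edt expects a single array; its second positional argument is sampling, not a second coordinate list, which is why passing x and y separately raises the rank error. One assumed fix is to rasterize the x, y points onto a 2D grid first, then take the distance of every pixel to its nearest point:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# hypothetical point coordinates, already scaled to integer pixel indices
x = np.array([2, 5, 7])
y = np.array([3, 1, 6])

grid = np.ones((10, 10), dtype=bool)  # True = background pixels
grid[y, x] = False                    # zeros mark the points themselves
dis = distance_transform_edt(grid)    # distance of every pixel to its nearest point
```

The resulting dis array can then be fed to gaussian_filter and plt.imshow exactly as in the script above; the grid size and the scaling of the CSV coordinates to pixel indices are assumptions to adapt.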

Explain Line of Code

The program returns the longest common subsequence using recursion. However, I need someone to explain to me:

return max(lcs_recursive(xlist, ys), lcs_recursive(xs, ylist), key = len)

what exactly it returns, and whether it is the same as just writing:

return max(len(lcs_recursive(xlist,ys), lcs_recursive(xs, ylist))

So basically, I'd like someone to explain to me what exactly key = len does.

Code:

s1 = "abc"
s2 = "aebabc"

def lcs_recursive(xlist, ylist):
    if not xlist or not ylist:
        return []
    x, xs, y, ys = xlist[0], xlist[1:], ylist[0], ylist[1:]
    if x==y:
        return [x] + lcs_recursive(xs,ys)
    else:
        return max(lcs_recursive(xlist, ys), lcs_recursive(xs, ylist), key = len)
print lcs_recursive(s1, s2)
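As a minimal illustration (not from the original post) of what key=len changes: max with key=len compares the candidates by their length but returns the winning item itself, whereas wrapping the results in len would return only a number:

```python
seqs = [[1, 2], [1, 2, 3, 4], [1]]

longest = max(seqs, key=len)   # compares by len(), returns the sequence itself
assert longest == [1, 2, 3, 4]

# By contrast, this returns only the length of the longest sequence:
assert max(len(s) for s in seqs) == 4
```

So the two lines in the question are not equivalent: the recursion needs the subsequence itself (to keep prepending matched elements), not its length.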

django ManyToMany relationship doesn't work

I'm trying to do this in my models.py:

class Tag(models.Model):
    ''' snip '''
    name = models.CharField(max_length=30)

class Stuff(models.Model):
    kind = models.CharField(max_length=30)
    tag = models.ManyToManyField(Tag)

But when I make a query on Stuff in the shell, the relationship fields return None, like this:

>>> q = Stuff.objects.all()
>>> p = q.tag.name
>>> print q.tag.name
None

I can't use these keys in my template either. The database backend is MySQL.

What is the problem?

Mark interpolated NaN points in Pandas plot

When I use interpolation (or fillna, or any other method of generating some fake data) in Pandas, I would like this to show in my plots. Ideally, I would like to use a different marker for these points in the plot. For regular points I want to use filled circles ('o'), for fake data I want to use crosses ('x').

Of course, I would like to do this with a nice Pythonic oneliner.

One further complication is that I want to use the subplots option in the plot function to plot all my columns at once. I'm hoping manipulating the subplots with Matplotlib voodoo is not necessary, though at this point that's the only option I can think of.

The data I'm using is something like the following (put into file 'meterstanden.ssv'):

datum       tijd   gas[m^3]   electra1[kWh] electra2[kWh]  water[m^3]
2015-03-06  09:00  4000.318   10300         9000           300.0
2015-03-24  20:10  4020.220   -             10003          -
2015-08-02  11:15  4120.388   10500         11000          350.5

And here is the script I'm using to process it:

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_table("meterstanden.ssv", delim_whitespace=True,
                   parse_dates=[[0, 1]], index_col=0, na_values=['-'])

df.interpolate(method='time').plot(subplots=True, layout=(2, 2),
                                   figsize=(14, 10), marker='o')
plt.show()

I want the - entries in the table to be plotted with cross-markers.
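(A sketch of one possible direction, not from the original post, shown on a single hypothetical Series rather than the full table.) The idea is to record which values were missing before interpolating, then plot the interpolated subset a second time with a different marker; for the subplots case the same mask logic would be applied per column:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])
mask = s.isna()            # True where the data was missing
filled = s.interpolate()   # fake data now sits at the masked positions

# Real points with 'o', interpolated points with 'x', e.g.:
# ax = filled.plot(marker='o')
# filled[mask].plot(ax=ax, marker='x', linestyle='none')
```

Whether this can be collapsed into a true one-liner that also plays nicely with subplots=True is exactly the open question; the mask trick at least avoids touching the Matplotlib axes by hand for a single column.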

Can't use makemigrations - TypeError: can't compare offset-naive and offset-aware datetimes

I'm getting a TypeError: can't compare offset-naive and offset-aware datetimes error when I run the python manage.py runserver command.

In the past I had a problem with timezones; I was using USE_TZ = True but changed it to False as suggested in this question. I'm not sure whether the two are related, but I have no idea what the cause is or how to fix this problem.

Python & Pandas: Add a link to existing field

In my pandas table, the URL is in ['douban_info']['alt']. I want to use it to turn the existing field ['db_rating'] into a link, maybe something like:

pd_data['db_rating'] = '<a href=pd_data['douban_info']['alt'] >pd_data['db_rating']</a>'

but this above certainly doesn't work, I can only do something like:

from IPython.display import HTML
pd.set_option('max_colwidth', 500)
# link is in ['douban_info']['alt']
pd_data['link'] = pd_data['douban_info'].apply(lambda x:  x['alt'])
pd_data['a'] = pd_data['link'].apply(lambda x: '<a href="{0}">link</a>'.format(x))
# drop redundant info to make the table look better
pd_data = pd_data.drop('douban_info', 1)
pd_data = pd_data.drop('omdb_info', 1)
pd_data = pd_data.drop('link', 1)
HTML(pd_data[0:5].to_html(escape=False))

This can only add a new a column.

There are certain things that are really annoying:

  1. To get data from the JSON, I only know to use pd_data['link'] = pd_data['douban_info'].apply(lambda x: x['alt'])

  2. I only know how to carry info from one field into another; in the case above, from link to a. I don't know how to take link and incorporate it into db_rating. How can I do that?
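(A sketch, not from the original post; the column values are made up.) To combine two columns of the same row into one string, apply can be called on the DataFrame with axis=1 so the lambda receives a whole row rather than a single value:

```python
import pandas as pd

df = pd.DataFrame({
    "db_rating": [8.7, 9.1],
    "link": ["http://example.com/a", "http://example.com/b"],  # hypothetical URLs
})

# Build the anchor tag from two columns of the same row (axis=1 = row-wise)
df["db_rating"] = df.apply(
    lambda row: '<a href="{0}">{1}</a>'.format(row["link"], row["db_rating"]),
    axis=1,
)
```

In the original setup, row["link"] would instead be row["douban_info"]["alt"], so the intermediate link column is not even needed before the HTML() rendering step.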

Filter a 2D numpy array from an array of values

Let's say I have a numpy array with the following shape :

nonSortedNonFiltered=np.array([[9,8,5,4,6,7,1,2,3],[1,3,2,6,4,5,7,9,8]])

I want to:

  • Sort the array according to nonSortedNonFiltered[1]
  • Filter the array according to nonSortedNonFiltered[0] and an array of values

I currently do the sorting with :

sortedNonFiltered=nonSortedNonFiltered[:,nonSortedNonFiltered[1].argsort()]

Which gives : np.array([[9 5 8 6 7 4 1 3 2],[1 2 3 4 5 6 7 8 9]])

Now I want to filter sortedNonFiltered from an array of values, for example :

sortedNonFiltered=np.array([[9 5 8 6 7 4 1 3 2],[1 2 3 4 5 6 7 8 9]])
listOfValues=np.array([8 6 5 2 1])
...Something here...

> np.array([5 8 6 1 2],[2 3 4 7 9]) #What I want to get in the end

Note : Each value in a column of my 2D array is exclusive.
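(A sketch of one way to fill in the "Something here" step, not from the original post.) Since each value in the first row is unique, a boolean mask built with np.isin selects exactly the wanted columns while preserving their order:

```python
import numpy as np

sortedNonFiltered = np.array([[9, 5, 8, 6, 7, 4, 1, 3, 2],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9]])
listOfValues = np.array([8, 6, 5, 2, 1])

# Keep only the columns whose first-row value appears in listOfValues
mask = np.isin(sortedNonFiltered[0], listOfValues)
result = sortedNonFiltered[:, mask]
```

On older NumPy versions (np.isin was added in 1.13), np.in1d does the same job.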

Django REST Framework: overriding get_queryset() sometimes returns a doubled queryset

I made a little endpoint, adapting DRF ReadOnlyModelViewSet, defined as follows:

class MyApi(viewsets.ReadOnlyModelViewSet):

    queryset = []
    serializer_class = MySerializer

    def get_queryset(self):
        print 'Debug: I am starting...\n\n\n\n'
        # do a lot of things filtering data from Django models by some information on neo4j and saving data in the queryset...
        return self.queryset

When I call MyApi via its URL, it returns results without any problems, but sometimes it returns a doubled result! It's not a systematic error; it happens only sometimes.

I use the line print 'Debug: I am starting...\n\n\n\n' in Apache log to investigate the problem. When that doubling happens, I read in the log:

Debug: I am starting...




Debug: I am starting...

It seems like get_queryset is called more than once, which is very strange. I haven't reported the details of the logic inside that method because I think the problem is elsewhere, or it is a bug. How can I solve this?

How to pass variables to pyjade using simple_convert?

Template looks like:

template = """!!! 5
html
    head
        title Some title
...

and render with simple_convert:

html = pyjade.simple_convert(template)

But I need to use Jade's if statements inside the string template, and passing variables with .format(variable=value) doesn't work.

UWSGI log rotation

I am using uwsgi 2.0.10 version.

/etc/init/uwsgi.conf :

exec /home/testuser/virtual_environments/app/bin/uwsgi \
--master \
--processes 4 \
--die-on-term \
--uid testuser \
--home /home/testuser/virtual_environments/app \
--pythonpath /home/testuser/app \
--socket /tmp/uwsgi.sock \
--chmod-socket 660 \
--no-site \
--vhost \
--logto /var/log/uwsgi/uwsgi.log

/etc/logrotate.d/uwsgi:

/var/log/uwsgi/uwsgi.log {
    rotate 10
    daily
    compress
    missingok
    create 640 root root
    postrotate
        initctl restart uwsgi >/dev/null 2>&1
    endscript
}
But when I restart, the server throws the message Unknown instance:. I need log rotation to produce a daily log file. Please correct my uwsgi.conf file if any modification is needed. Please help me.
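(An aside, not from the original post.) If restarting uWSGI via initctl is the part that fails, one commonly used alternative is logrotate's copytruncate directive, which copies the log and truncates it in place so the running uWSGI process never needs to be touched; a sketch of /etc/logrotate.d/uwsgi under that assumption:

```
/var/log/uwsgi/uwsgi.log {
    rotate 10
    daily
    compress
    missingok
    copytruncate
}
```

The trade-off is that a few log lines written between the copy and the truncate can be lost.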

Python Grouping Data

I have a set of data:

(1438672131.185164, 377961152)                                                                                                       
(1438672132.264816, 377961421)                                                                                                       
(1438672133.333846, 377961690)                                                                                                       
(1438672134.388937, 377961954)                                                                                                      
(1438672135.449144, 377962220)
(1438672136.540044, 377962483)
(1438672137.172971, 377962763)
(1438672138.24253, 377962915)
(1438672138.652991, 377963185)
(1438672139.069998, 377963285)
(1438672139.44115, 377963388)

What I need to figure out is how to group them. Until now I've used a very simple approach: diff the second elements of two tuples, and if the diff was bigger than a certain pre-defined threshold, put them into different groups. But it has yielded only unsatisfactory results.

Theoretically, though, I imagine it should be possible to determine whether the second element of a tuple belongs to a given group by fitting the points to one or multiple lines, because I know the first element is strictly monotonic (it's a timestamp from time.time()) and I know that all resulting data sets will be close to linear. Let's say the tuple is (y, x). There are only three options:

  • Either all data fits the same equation y = mx + c
  • Or there is only a differing offset c or
  • there is an offset c and a different m

The above set would be one group only. The following set would resolve into three groups:

(1438672131.185164, 377961152)                                                                                                       
(1438672132.264816, 961421)                                                                                                       
(1438672133.333846, 477961690)                                                                                                       
(1438672134.388937, 377961954)                                                                                                      
(1438672135.449144, 962220)
(1438672136.540044, 377962483)
(1438672137.172971, 377962763)
(1438672138.24253, 377962915)
(1438672138.652991, 377963185)
(1438672139.069998, 477963285)
(1438672139.44115, 963388)

group1:

(1438672131.185164, 377961152)                                                                                                       
(1438672134.388937, 377961954)                                                                                                      
(1438672136.540044, 377962483)
(1438672137.172971, 377962763)
(1438672138.24253, 377962915)
(1438672138.652991, 377963185)

group2:

(1438672132.264816, 961421)                                                                                                       
(1438672135.449144, 962220)
(1438672139.44115, 963388)

group3:

(1438672133.333846, 477961690)                                                                                                       
(1438672139.069998, 477963285)

Is there a module or another simple solution that will solve this problem? I've found least-squares in numpy and scipy, but I'm not quite sure how to properly use or apply them. If there is another way besides linear functions, I'm happy to hear about it as well!
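(A sketch of one possible approach, not from the original post; the data, slope, and tolerance are made up.) If the slope m is shared across groups, as in the "differing offset c" case above, each point can be reduced to its offset c = y - m*x, and points whose offsets are close together belong to the same line:

```python
import numpy as np

# hypothetical data: (x, y) pairs from two interleaved lines with the same slope
data = [(0.0, 100.0), (1.0, 1110.0), (2.0, 120.0), (3.0, 1130.0)]

m = 10.0  # assumed/estimated common slope, e.g. from np.polyfit on one group
offsets = np.array([y - m * x for x, y in data])

# Cluster indices whose offsets lie within a tolerance of each other
order = np.argsort(offsets)
groups, tol = [], 50.0
for i in order:
    if groups and abs(offsets[i] - offsets[groups[-1][-1]]) <= tol:
        groups[-1].append(int(i))
    else:
        groups.append([int(i)])
```

For the harder "different m per group" case, an iterative fit-and-assign loop (fit a line per group with np.polyfit, reassign points to the best-fitting line, repeat) would be one direction, though that is essentially a small RANSAC/clustering problem.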

Arcpy.ValidateFieldName does not validate field name?

I am working on a Python script using Arcpy. It creates a shapefile and then adds fields to it, with names coming from user input. From the string the user has entered I need to produce a valid field name. I thought arcpy.ValidateFieldName() would accomplish this; however, I am having problems. Consider the code below:

#Create a new shapefile.
arcpy.CreateFeatureclass_management(r"C:\path\to\file", "shape.shp")

#Validate the fieldname.
name = arcpy.ValidateFieldName("0FIELD", "shape")

#Add the field
arcpy.AddField_management("shape", name, "STRING")

Even though the field name has been validated, it throws an error:

Runtime error Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\script.py", line 8, in <module>
    arcpy.AddField_management("shape", name, "STRING")
  File "c:\program files (x86)\arcgis\desktop10.2\arcpy\arcpy\management.py", line 3200, in AddField
    raise e
ExecuteError: ERROR 000310: The Field name must not start with a number

The function corrects other disallowed characters (such as replacing spaces with underscores, and capping the string at 10 characters), but it fails to do anything about the first character being a number, even though that is not allowed in shapefile field names.

Is this a bug, or am I using arcpy.ValidateFieldName() wrongly? Is there some other function I should use? Or will I have to write one myself? If so, what should it look like?
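(Not from the original post; the helper name, the prefix character, and the exact sanitizing rules are assumptions.) As a fallback, a small wrapper can enforce the missing rule: replace non-word characters, prefix names that start with a digit, and cap the length:

```python
import re

def make_valid_field_name(name, max_len=10):
    """Hypothetical sanitizer for shapefile field names.

    Replaces disallowed characters with underscores, prefixes names
    that start with a digit, and caps the result at max_len characters.
    """
    name = re.sub(r'\W', '_', name)   # spaces, punctuation -> underscores
    if name[:1].isdigit():            # shapefile fields must not start with a digit
        name = 'F' + name
    return name[:max_len]
```

In the script above this could run after (or instead of) arcpy.ValidateFieldName, so that "0FIELD" becomes "F0FIELD" before AddField_management is called.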

Using the factory_boy ImageField results in a missing attribute _committed error

I'm trying to set up data for a test case that requires a django.db.models.ImageField. I'm trying to use factory.django.ImageField from factory_boy, but get the error AttributeError: 'ImageField' object has no attribute '_committed'.

Simplified django object:

class GalleryImage(models.Model):
    image = models.ImageField(upload_to='uploads/products')

Factory class:

class GalleryImageFactory(factory.DjangoModelFactory):
    class Meta:
        model = models.GalleryImage

usage in test:

img = factory.django.ImageField(filename='test.jpg')
GalleryImageFactory.create(image=img)

This gives me the following error, pointing to the line that creates the GalleryImageFactory:

AttributeError: 'ImageField' object has no attribute '_committed'

I'm running Python 2.7.6, factory-boy 2.4.1, and Django 1.6.8.

Full stack trace:

Traceback (most recent call last):
  File "/home/vagrant/sportamore/tests/sportamor_tests/catalog/test_pricefeed_google.py", line 80, in setUp
    image=img)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/factory/base.py", line 585, in create
    return cls._generate(True, attrs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/factory/base.py", line 510, in _generate
    obj = cls._prepare(create, **attrs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/factory/base.py", line 485, in _prepare
    return cls._create(model_class, *args, **kwargs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/factory/django.py", line 153, in _create
    return manager.create(*args, **kwargs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/manager.py", line 157, in create
    return self.get_queryset().create(**kwargs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/query.py", line 322, in create
    obj.save(force_insert=True, using=self.db)
  File "/home/vagrant/sportamore/sportamor/catalog/models/__init__.py", line 3960, in save
    super(GalleryImage, self).save(*args, **kwargs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/base.py", line 545, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/base.py", line 573, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/base.py", line 654, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/base.py", line 687, in _do_insert
    using=using, raw=raw)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/query.py", line 1514, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 902, in execute_sql
    for sql, params in self.as_sql():
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 860, in as_sql
    for obj in self.query.objs
  File "/home/vagrant/sportamore/venv/local/lib/python2.7/site-packages/django/db/models/fields/files.py", line 250, in pre_save
    if file and not file._committed:
AttributeError: 'ImageField' object has no attribute '_committed'

Any help is appreciated, including tips on other methods of getting a valid image with the specified filename in place. Thanks in advance!

Loading neo4j query result into python's `igraph` graph

How would you load the results of a Cypher query into an igraph in python, keeping all the edge and vertex attributes?

How to free resource in PyGame mixer?

I use the gTTS python module to get mp3 output from the Google Text-To-Speech API, and PyGame to play the resulting mp3 files without opening an external player (is there any simpler way to do it?).

However, it seems like the PyGame mixer doesn't free the file resource, even after its quit method.

phrase = "Hello!"
tts = gtts.gTTS(text=phrase, lang='en')
tts.save("googleTTS.mp3")

mixer.init(16000)   # should be dynamic
mixer.music.load("googleTTS.mp3")
mixer.music.play()
while mixer.music.get_busy() == True:
    continue
mixer.quit()        # doesn't free resource?

phrase = "Bye!"
tts = gtts.gTTS(text=phrase, lang='en')
tts.save("googleTTS.mp3")

The last line gives an exception:

    IOError: [Errno 13] Permission denied: 'googleTTS.mp3'

I should note that the problem isn't in the tts.save function, because the code without the mixer works fine.

How can I free mixer resource and use the same file over and over again?

Celery error : result.get times out

I've installed Celery and I'm trying to test it with the Celery First Steps Doc.

I tried using both Redis and RabbitMQ as brokers and backends, but I can't get the result with:

result.get(timeout = 10)

Each time, I get this error :

Traceback (most recent call last):
  File "<stdin>", line 11, in <module>
  File "/home/mehdi/.virtualenvs/python3/lib/python3.4/site-packages/celery/result.py", line 169, in get
    no_ack=no_ack,
  File "/home/mehdi/.virtualenvs/python3/lib/python3.4/site-packages/celery/backends/base.py", line 225, in wait_for
    raise TimeoutError('The operation timed out.')
celery.exceptions.TimeoutError: The operation timed out.

The broker part seems to work just fine: when I run this code

from celery import Celery

app = Celery('tasks', backend='redis://localhost/', broker='amqp://')

@app.task
def add(x, y):
    return x + y

result = add.delay(4,4)

I get (as expected)

[2015-08-04 12:05:44,910: INFO/MainProcess] Received task: tasks.add[741160b8-cb7b-4e63-93c3-f5e43f8f8a02]

[2015-08-04 12:05:44,911: INFO/MainProcess] Task tasks.add[741160b8-cb7b-4e63-93c3-f5e43f8f8a02] succeeded in 0.0004287530000510742s: 8

P.S.: I'm using Xubuntu 64-bit.

Reading specific columns from CSV Python

I am trying to parse through a CSV file and extract a few columns from it.

ID | Code | Phase | FBB | AM | Development status | AN REMARKS | stem | year | IN-NAME | IN Year | Company
L2106538 | Rs124 | 4 | | | Unknown | | -pre- | 1982 | Domoedne | 1982 | XYZ

I would like to group and extract a few columns for uploading them to different models.

For example, I would like to group the first 3 columns into one model, the next two into a different model, the first column together with columns 6 and 7 into another model, and so on.

I also need to keep the header of the file and store the data as key value pair so that I would know which column should go for a particular field in a model.

This is what I have so far.

def group_header_value(file):
    reader = csv.DictReader(open(file, 'r'))# to have the header and get the data as a key value pair.
    all_result= []
    for row in reader:
        print row
        all_result.append(row)
    return all_result


def group_by_models(all_results):
    MD = range(1,3) # to get the required cols. 
    for every_row in all_results:
        contents = [(every_row[i] for i in MD)]
        print contents

def handle(self, *args, **options):
        database = options.get('database')
        filename = options.get('filename')
        all_results =  group_header_value(filename)
        print 'grouped_bymodel', group_by_models(all_results)

This is what I get when I try to print the contents grouped by model: <generator object <genexpr> at 0x7f9f5382e0f0> <generator object <genexpr> at 0x7f9f5382e0a0> <generator object <genexpr> at 0x7f9f5382e0f0>

Is there a different approach to extracting particular columns with DictReader? How else can I extract the required columns using DictReader? Thanks.
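(A sketch, not from the original post; the CSV content is a trimmed, hypothetical version of the file.) DictReader keys each row by header name, so integer indices like every_row[i] don't apply; selecting columns by name also avoids the stray generator objects printed above (the square brackets around the genexpr never materialize it):

```python
import csv
import io

# hypothetical, simplified file content with a header row
data = "ID,Code,Phase,FBB,AM\nL2106538,Rs124,4,,\n"
reader = csv.DictReader(io.StringIO(data))
rows = list(reader)

# pick columns by header name for each target model
model_a = [{k: row[k] for k in ("ID", "Code", "Phase")} for row in rows]
model_b = [{k: row[k] for k in ("FBB", "AM")} for row in rows]
```

Each resulting dict keeps the header-to-value pairing, so it can be passed straight to a model's field mapping.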

Flask 301 redirect on some parameters

I am setting up a simple Flask REST API, but ran into an issue when testing: only one of my IDs causes the application to throw a 301. I can't tell why this ID behaves differently than all the others.

object_view = ObjectView.as_view('object') #subclass of MethodView
app.add_url_rule('/objects/',  view_func=object_view, methods=['GET',])
app.add_url_rule('/objects/', view_func=object_view, methods=['POST',])
app.add_url_rule('/objects/<object_id>', view_func=object_view, methods=['GET',])

This routing code works great, except when object_id is 88d63017-25ac-4c81-a637-1e6207986bc4. When I use that object_id I get a 301 and am redirected to the base /objects/ list.

I tried it with a trailing slash as well, and that does seem to fix it: /objects/88d63017-25ac-4c81-a637-1e6207986bc4/ returns a 200 as expected. But I am unsure why this works, and it breaks my conventions to do so.

Sending POST request with JSON data in DJANGO and response from view also JSON data but its giving 403 FORBIDDEN error

I am trying to send a POST request to Django with JSON data in it, and the view returns a response with JSON data. But when I send a request to it, it returns a 403 Forbidden error. I am using RESTClient to send/test POST requests.

I have read all about CSRF in the documentation, but it's not very helpful. I am fairly new to Django, and the other questions posted here are not helping me a lot.

The code in my view is:

from django.shortcuts import render
from django.http import HttpResponse
import json

def index(request):
    if request.is_ajax():
        if request.method == 'POST':
            print 'Raw Data: "%s"' % request.body
            reply = json.loads(request.body)
            return HttpResponse(reply)
        else:
            return HttpResponse("OK")
    else:
        return HttpResponse("OK")

Django Queryset for Forms not working

I don't know if I am attempting the impossible, but I have an HTML select option that comes from a Django forms.Form queryset, States.objects.all().

Model:

class Countries(models.Model):
    name = models.CharField(max_length=25)

Model:

class States(models.Model):
    country_id = models.ForeignKey('Countries')
    name = models.CharField(max_length=25)

Form:

class sign_up_form_school(forms.Form):
    states = forms.ModelChoiceField(
        queryset = States.objects.all(), 
        widget=forms.Select(attrs={
          'class': states.country_id.name #is this POSSIBLE?
          }))

I want each select option value to have a different class, as I have tried above, but it returns the error: name 'states' is not defined.

Can you access ManyToManyField values before saving in Django 1.7?

class Playlist(models.Model):
    name = models.CharField(max_length=50)
    num_of_songs = models.IntegerField(max_length=15)
    duration = models.IntegerField()
    songs = models.ManyToManyField("Song", blank=True)

    def save(self, *args, **kwargs):
        self.num_of_songs = self.songs.count() ==> this is wrong, but you get the idea
        super(Playlist, self).save(*args, **kwargs)

NOTE: The save method above is not a part of the original code, I added this to show you what I'm going for.

I have a Playlist model in Django as described above. When I create a playlist from my views, I use a ModelForm to save the object. In other words, I do a form.save(), then append the correct num_of_songs and duration based on the songs that are saved in the database, and then I do a playlist.save().

That all works fine and dandy, but if I create a playlist from the Django admin, however, I have to enter those values manually, and they won't be changed unless I change the list from the view. So in the admin I can enter wrong values and they'll stay that way.

When I try to override the save method, I can't get the selected songs because they're not saved in the database yet.

So my question is, is there a way to read the selected values from ManyToManyField before save? (I guess I could use a form, but working from admin I'm not sure if that's possible.)

Limited ODB output in ABAQUS

I'm writing an application for topology optimization within the ABAQUS PDE. As I have quite a few iterations, in each of which FEM is performed, a lot of data is written to disk, and thus a lot of time is lost on I/O.

Is it possible to limit the amount of information that gets written into the ODB file?

HTML ID value validation by RE

Problem Statement:

ID value must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").

I did this with a regular expression.

Code:

>>> import re
>>> id_value1 =  "custom-title1"
>>> id_value2 =  "1-custom-title"
>>> pattern = "[A-Za-z][\-A-Za-z0-9_:\.]*"

Code for valid ID value

>>> flag = False
>>> try:
...     if re.findall(pattern, id_value1)[0]==id_value1:
...         flag=True
... except IndexError:
...     pass
... 
>>> print flag
True

Code for invalid ID value:

>>> flag = False
>>> try:
...     if re.findall(pattern, id_value2)[0]==id_value2:
...         flag=True
... except IndexError:
...     pass
... 
>>> print flag
False

Code for IndexError

>>> try:
...     if re.findall(pattern, "")[0]=="":
...         print "In "
... except IndexError:
...    print "Exception Index Error"
... 
Exception Index Error
>>> 

I will move the above code into one function. This function will be called more than 1000 times, so can anyone optimize it?
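One way to speed this up is to compile the pattern once at module load instead of re-parsing it on every call, and to anchor it with \Z so the whole string must match, which also removes the need for the try/except around [0]. A sketch along those lines:

```python
import re

# Compiled once; \Z anchors the match to the end of the string, so
# "1-custom-title" and "" fail without any IndexError handling.
ID_RE = re.compile(r'[A-Za-z][-A-Za-z0-9_:.]*\Z')

def is_valid_id(value):
    return ID_RE.match(value) is not None

print(is_valid_id("custom-title1"))   # True
print(is_valid_id("1-custom-title"))  # False
print(is_valid_id(""))                # False
```

Since re.match already anchors at the start of the string, only the end anchor needs to be added to the pattern.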

Python - Random Forest - Iteratively adding trees

Good afternoon! Sorry for my English, but I need your help. I am doing a machine learning task in Python. I need to build a Random Forest and then plot a graph showing how the quality on the training and test samples depends on the number of trees in the forest. Is it necessary to build a new Random Forest from scratch for each number of trees? Or can I somehow add trees iteratively (if that is possible, can you give an example of code showing how to do it)?
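Assuming scikit-learn is the library in use (the question doesn't name one), RandomForestClassifier supports warm_start, which keeps the trees already grown and only fits the additional ones when n_estimators is raised. A sketch on toy data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)  # toy data

# With warm_start=True, refitting after raising n_estimators grows
# only the extra trees instead of rebuilding the whole forest.
clf = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
scores = []
for n in range(10, 60, 10):
    clf.n_estimators = n
    clf.fit(X, y)
    scores.append((n, clf.score(X, y)))  # quality vs. number of trees

print(scores)
```

The (n, score) pairs collected in the loop are exactly the points needed for the requested graph.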

regex pattern for a reference point

Using regex in Python, I want to find a pattern and make it a reference point for the text that follows. When the pattern is found again, a new record begins, with the second occurrence as the reference point for everything that follows, until the next occurrence and its record.

I have been studying search, findall, and match, and group(), but I can't seem to find out how to do this. I know this isn't a perfect question, but I'm a noob. Thanks for any direction.
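One common approach is re.split with a capturing group: the group makes split keep each matched header, so every header can be paired with the text that follows it. The header pattern below (REC \d+) and the sample text are purely hypothetical stand-ins:

```python
import re

text = "REC 1\nalpha\nbeta\nREC 2\ngamma\n"

# A capturing group makes re.split KEEP the delimiters, so headers
# land at even indices and their following text at odd indices.
parts = re.split(r'(REC \d+)', text)[1:]  # [1:] drops text before the first header
records = {parts[i]: parts[i + 1].strip() for i in range(0, len(parts), 2)}
print(records)  # {'REC 1': 'alpha\nbeta', 'REC 2': 'gamma'}
```

re.finditer over the header pattern would work too (using the match offsets to slice the text), but the split-with-group form is usually the shortest.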

Docker: How to run cython_extensions?

FROM ubuntu:14.04.2
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN apt-get -y update && apt-get upgrade -y
RUN apt-get install python build-essential python-dev python-pip python-setuptools -y
RUN apt-get install libxml2-dev libxslt1-dev python-dev -y
RUN apt-get install libpq-dev postgresql-common postgresql-client -y
RUN apt-get install openssl openssl-blacklist openssl-blacklist-extra -y
RUN apt-get install nginx -y
RUN pip install "pip>=7.0"
RUN pip install virtualenv uwsgi

ADD canonicaliser_api /home/ubuntu/canonicaliser_api
RUN virtualenv /home/ubuntu/canonicaliser_api/venv
RUN source /home/ubuntu/canonicaliser_api/venv/bin/activate && pip install -r /home/ubuntu/canonicaliser_api/requirements.txt
RUN export CFLAGS=-I/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/include/
RUN source /home/ubuntu/canonicaliser_api/venv/bin/activate && \
    python /home/ubuntu/canonicaliser_api/canonicaliser/cython_extensions/setup.py \
      build_ext --inplace

The last line crashes with:

  Traceback (most recent call last):
  File "/home/ubuntu/canonicaliser_api/canonicaliser/cython_extensions/setup.py", line 5, in <module>
    ext_modules = cythonize("*.pyx")
  File "/home/ubuntu/canonicaliser_api/venv/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 754, in cythonize
    aliases=aliases)
  File "/home/ubuntu/canonicaliser_api/venv/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 649, in create_extension_list
    for file in nonempty(extended_iglob(filepattern), "'%s' doesn't match any files" % filepattern):
  File "/home/ubuntu/canonicaliser_api/venv/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 103, in nonempty
    raise ValueError(error_msg)
ValueError: '*.pyx' doesn't match any files
...

What am I missing please?
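For what it's worth, cythonize("*.pyx") resolves the pattern against the current working directory, not against the directory containing setup.py, and docker build runs each RUN step from the image root. A small demonstration of the underlying glob behaviour, using a throwaway directory as a stand-in for cython_extensions/:

```python
import glob
import os
import tempfile

# A directory containing one .pyx file, standing in for cython_extensions/.
src_dir = tempfile.mkdtemp()
open(os.path.join(src_dir, "demo.pyx"), "w").close()

# A bare relative pattern like "*.pyx" is resolved against the current
# working directory; if that isn't src_dir, nothing matches -- exactly
# the "doesn't match any files" failure mode.
print(glob.glob(os.path.join(src_dir, "*.pyx")))  # finds demo.pyx
```

Changing into the extension directory before invoking the build (e.g. `cd .../cython_extensions && python setup.py build_ext --inplace` within the RUN step) would sidestep this.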

Plot 4th dimension with Python

I would like to know if it is possible to plot in four dimensions using Python. In particular, I would like to have a three-dimensional mesh X, Y, Z with f(X,Y,Z) = 1 or f(X,Y,Z) = 0. So I need a symbol (for example "o" or "x") at specific points (X,Y,Z). I don't need a color scale.

Note that I have 100 matrices (512*512) composed of 1s and 0s, so my mesh should be 512*512*100.

I hope I have been clear! Thanks.
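A 3D scatter of the coordinates where f is 1 gives exactly that: marker symbols, no colour scale. A minimal matplotlib sketch, where the volume below is a hypothetical stand-in for your stack of 0/1 matrices (only two points are set so the plot stays readable):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

vol = np.zeros((512, 512, 100), dtype=np.uint8)  # stack of 0/1 matrices
vol[100, 200, 50] = 1                            # two sample points
vol[300, 400, 75] = 1

xs, ys, zs = np.nonzero(vol)  # coordinates where f(X, Y, Z) == 1

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(xs, ys, zs, marker="o")
fig.savefig("points.png")
```

np.nonzero reduces the fourth dimension to a point set, so only the 1-valued voxels are drawn.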

How to remove table name from a SQLAlchemy WHERE statement [duplicate]

This question already has an answer here:

I am using SQLAlchemy Core with PostgreSQL, but I'm also using the non-SQL database Cassandra, and I would like to keep using SQLAlchemy's query builder with Cassandra, because its CQL language is quite similar in most cases. The thing is, when I try to run a statement like this:

stmt = select([users.c.name]).select_from(users).where(users.c.id == id) 

I am getting this query as result:

SELECT name FROM users WHERE users.id == :id

Is there a way in SQLAlchemy to get the id in the WHERE clause without the table prefix: users.id, so the queries can work also in Cassandra?

Parsing text afterwards is not allowed!

Thanks

Pulling posts & messages from the last year on facebook for project

I want to try and pull posts and messages from the last year from Facebook for my project. I've managed to pull the data, but I don't know how I would pull it for the last 365 days specifically.

Here's my code so far:

#Modules
import requests
import facebook

def some_action(post):
    print posts['data']
    print post['created_time']

#Token
access_token = '...'
user = 'walkers'

#Posts
graph = facebook.GraphAPI(access_token)
profile = graph.get_object(user)
posts = graph.get_connections(profile['id'], 'posts')

x = 0
while x < 10:
    #while True:
    try:
        posts = requests.get(posts['paging']['next']).json()
        #print posts

    except KeyError:
        break
    x = x+1

#Write
f = open("111.txt", "w")
f.write (str(posts) + "\n")
f.close()

And here's some of the output:

{u'paging': {u'next': u'http://ift.tt/1KNIaK1', u'previous': u'http://ift.tt/1ImpzPx'}, u'data': [{u'picture': u'http://ift.tt/1KNIaK3', u'story': u'Walkers added 8 new photos to the album: 20 Years of Gary.', u'likes': {u'paging': {u'cursors': {u'after': u'MzgxNzk1MDQ4NjQ3MTI3', u'before': u'MTAxNTI0OTY1MDIwNDIyODI='}}, u'data': [{u'id': u'10152496502042282', u'name': u'Aaron Hanson'}, {u'id': u'10203040513950876', u'name': u'Gary GazRico Hinchliffe'}, {u'id': u'10152934096109345', u'name': u'Stuart Collister'}, {u'id': u'10152297022606059', u'name': u'Helen Preston'}, {u'id': u'326285380900188', u'name': u'Rhys Edwards'}, {u'id': u'10204744346589601', u'name': u'Aaron Benfield'}, {u'id': u'10200910780691953', u'name': u'Mike S Q Wilkins'}, {u'id': u'10204902354187051', u'name': u'Paul Owen Davies'}, {u'id': u'10152784755311784', u'name': u'Dafydd Ifan'}, {u'id': u'1517704468487365', u'name': u'Stephen Collier'}, {u'id': u'10202198826115234', u'name': u'John McKellar'}, {u'id': u'10151949129487143', u'name': u'Lucy Morrison'}, {u'id': u'1474199509524133', u'name': u'Christine Leek'}, {u'id': u'381795048647127', u'name': u'Sandra Taylor'}]}, u'from': {u'category': u'Product/Service', u'name': u'Walkers', u'id': u'53198517648'}, u'name': u'20 Years of Gary', u'is_hidden': False, u'privacy': {u'allow': u'', u'deny': u'', u'friends': u'', u'description': u'', u'value': u''}, u'is_expired': False, u'actions': [{u'link': u'http://ift.tt/1ImpBH5', u'name': u'Comment'}, {u'link': u'http://ift.tt/1ImpBH5', u'name': u'Like'}], u'updated_time': u'2015-05-07T11:13:26+0000', u'caption': u'To celebrate Gary\u2019s 20 years as the face of Walkers, we\u2019re recreating some scenes from his first ad in the ultimate #TBT! The cast have waited two decades for this sequel. 
#CantHelpButSmile', u'link': u'http://ift.tt/1KNIaK5', u'object_id': u'10153202855322649', u'shares': {u'count': 71}, u'story_tags': {u'0': [{u'length': 7, u'offset': 0, u'type': u'page', u'id': u'53198517648', u'name': u'Walkers'}]}, u'created_time': u'2015-05-07T11:13:26+0000', u'message': u'To celebrate Gary\u2019s 20 years as the face of Walkers, we\u2019re recreating some scenes from his first ad in the ultimate #TBT! The cast have waited two decades for this sequel. #CantHelpButSmile', u'type': u'photo', u'id': u'53198517648_10153202855322649', u'status_type': u'added_photos', u'icon': u'http://ift.tt/1JLaV6n'}, {u'picture': u'http://ift.tt/1KNI9pv', u'is_hidden': False, u'likes': {u'paging': {u'cursors': {u'after': u'NjIwNTUxMDg0NzAwODgx', u'before': u'Njk5NjczMDQzNDYxODgx'}, u'next': u'http://ift.tt/1ImpzPz'}, u'data': [{u'id': u'699673043461881', u'name': u'Summer Louise Lee'}, {u'id': u'778869825481375', u'name': u'Princess Nicole-sunny \u015eenyurt'}, {u'id': u'10203535327456872', u'name': u'Tamtam Harbi'}, {u'id': u'830511573639656', u'name': u'Jennifer Baker'}, {u'id': u'721613451224404', u'name': u'Ana Monroy'}, {u'id': u'383509561823365', u'name': u'Sheila Baud'}, {u'id': u'383392095146766', u'name': u'Gemma Fletcher'}, {u'id': u'10100758061580924', u'name': u'Rosalind Lee'}, {u'id': u'10152407789928449', u'name': u'Louise Allen'}, {u'id': u'10153154705875787', u'name': u'Tracey Hunt'}, {u'id': u'1570440839870835', u'name': u'Christine Bray'}, {u'id': u'462703947225536', u'name': u'David Williams'}, {u'id': u'1421406008149892', u'name': u'Caitlyn Molyneux'}, {u'id': u'10152538544760959', u'name': u'Hima Thanki'}, {u'id': u'608844039202546', u'name': u'Sara Veiga'}, {u'id': u'1692079987687260', u'name': u'Fraser George Stevens'}, {u'id': u'1410872812543572', u'name': u'Abby Sanderson'}, {u'id': u'10152954172786298', u'name': u'Jasmin Gosalia'}, {u'id': u'10204166664027765', u'name': u'Anthony Millward'}, {u'id': u'10152480783622038', u'name': u'Rachel 
Holms'}, {u'id': u'10202286549805262', u'name': u'Farzana Kausar'}, {u'id': u'787359004643685', u'name': u'Hayley Green'}, {u'id': u'305329966329014', u'name': u'Jemma Bailey'}, {u'id': u'972264679466809', u'name': u'Margaret Catley'}, {u'id': u'620551084700881', u'name': u'Suzanne Wrigley'}]}, u'from': {u'category': u'Product/Service', u'name': u'Walkers', u'id': u'53198517648'}, u'comments': {u'paging': {u'cursors': {u'after': u'MQ==', u'before': u'MQ=='}}, u'data': [{u'from': {u'name': u'David Williams', u'id': u'462703947225536'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-05-03T08:57:05+0000', u'message': u'Yes takes me back, not to the pram , but when I could eat crisps with out worrying about my weight.\U0001f612', u'id': u'10153192551122649_10153194592792649', u'user_likes': False}]}, u'privacy': {u'allow': u'', u'deny': u'', u'friends': u'', u'description': u'', u'value': u''}, u'is_expired': False, u'actions': [{u'link': u'http://ift.tt/1KNIaK7', u'name': u'Comment'}, {u'link': u'http://ift.tt/1KNIaK7', u'name': u'Like'}], u'properties': [{u'text': u'00:07', u'name': u'Length'}], u'source': u'http://ift.tt/1ImpBHd', u'link': u'http://ift.tt/1KNI9px', u'object_id': u'10153192551122649', u'shares': {u'count': 10}, u'created_time': u'2015-05-02T11:17:49+0000', u'message': u'#CantHelpButSmile', u'updated_time': u'2015-05-03T08:57:05+0000', u'type': u'video', u'id': u'53198517648_10153192551122649', u'status_type': u'added_video', u'icon': u'http://ift.tt/1JLaVTY'}, {u'picture': u'http://ift.tt/1KNI9pz', u'is_hidden': False, u'likes': {u'paging': {u'cursors': {u'after': u'ODc1MDc4MDc1ODQxNzg4', u'before': u'Nzc4ODY5ODI1NDgxMzc1'}, u'next': u'http://ift.tt/1ImpBXv'}, u'data': [{u'id': u'778869825481375', u'name': u'Princess Nicole-sunny \u015eenyurt'}, {u'id': u'10203535327456872', u'name': u'Tamtam Harbi'}, {u'id': u'842151215867368', u'name': u'Waqas Javid'}, {u'id': u'925564307454778', u'name': u'Steph Louise'}, {u'id': 
u'10152641983603000', u'name': u'Harry Vincent'}, {u'id': u'513166088828812', u'name': u'Dillon Morgan-Jones'}, {u'id': u'1029234627087388', u'name': u'Alison Richards'}, {u'id': u'10152636466428548', u'name': u'Elin Mai Jones'}, {u'id': u'1791653647727554', u'name': u'Cameron Ward'}, {u'id': u'1035254736490945', u'name': u'Umer Choudhary'}, {u'id': u'1571886136400010', u'name': u'Liam Llewellyn'}, {u'id': u'10205154376362734', u'name': u'Kate Neesham'}, {u'id': u'10152034590826596', u'name': u'Tim Allan'}, {u'id': u'550381381747086', u'name': u'Stephanie Tracey Stubbs Fry'}, {u'id': u'10154569916710635', u'name': u'Becky Cronin'}, {u'id': u'10153110708653885', u'name': u'Jason Plastow'}, {u'id': u'10203270854452666', u'name': u'Sean Uprichard'}, {u'id': u'778773632211715', u'name': u'Shohel Rana'}, {u'id': u'10200984087444257', u'name': u'Cally Wilson'}, {u'id': u'10152010672247327', u'name': u'Daniel Newbie Newman'}, {u'id': u'862886047084648', u'name': u'Niamh Angel Hodge'}, {u'id': u'10152670558197305', u'name': u'Steven Dixon'}, {u'id': u'4788216560078', u'name': u'Marcin Franke'}, {u'id': u'615901501858007', u'name': u'Arham Chaudhry'}, {u'id': u'875078075841788', u'name': u'Jordon Adams'}]}, u'from': {u'category': u'Product/Service', u'name': u'Walkers', u'id': u'53198517648'}, u'comments': {u'paging': {u'cursors': {u'after': u'MQ==', u'before': u'OA=='}}, u'data': [{u'from': {u'name': u'Mark Brown', u'id': u'10154058875275543'}, u'like_count': 6, u'can_remove': False, u'created_time': u'2015-05-02T08:52:30+0000', u'message': u'This looks absolutely dreadful', u'id': u'10153186245247649_10153192311622649', u'user_likes': False}, {u'from': {u'name': u'Jonny Nicholson', u'id': u'753425971376851'}, u'like_count': 1, u'can_remove': False, u'created_time': u'2015-05-02T21:06:15+0000', u'message': u'when will you be making full bag?', u'id': u'10153186245247649_10153193614087649', u'user_likes': False}, {u'from': {u'name': u'Emma Louise Webb', u'id': 
u'10152574201891007'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-05-02T18:01:03+0000', u'message': u'Wow amazing', u'id': u'10153186245247649_10153193237142649', u'user_likes': False}, {u'from': {u'name': u'Thom Williams', u'id': u'10153528602434310'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-05-03T12:57:49+0000', u'message': u'Jesus christ', u'id': u'10153186245247649_10153194928492649', u'user_likes': False}, {u'from': {u'name': u'Mattie Bradshaw', u'id': u'10154017626580117'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-05-03T14:37:20+0000', u'message': u'Tragic marketing', u'id': u'10153186245247649_10153195099017649', u'user_likes': False}, {u'from': {u'name': u'Loriley Sessions', u'id': u'10152696432105155'}, u'message_tags': [{u'length': 10, u'offset': 0, u'type': u'user', u'id': u'10154384070225403', u'name': u'James Clee'}], u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-29T16:19:03+0000', u'message': u'James Clee ya main man', u'id': u'10153186245247649_10153186337617649', u'user_likes': False}]}, u'privacy': {u'allow': u'', u'deny': u'', u'friends': u'', u'description': u'', u'value': u''}, u'is_expired': False, u'actions': [{u'link': u'http://ift.tt/1KNI9pB', u'name': u'Comment'}, {u'link': u'http://ift.tt/1KNI9pB', u'name': u'Like'}], u'properties': [{u'text': u'00:39', u'name': u'Length'}], u'source': u'http://ift.tt/1ImpA5T', u'link': u'http://ift.tt/1KNIaKb', u'object_id': u'10153186245247649', u'shares': {u'count': 4}, u'created_time': u'2015-04-29T16:00:01+0000', u'message': u'Need a laugh? Here\u2019s our latest Snackdown podcast, a monthly roundup of news with a group of comedy friends. 
#CantHelpButSmile http://apple.co/1dsPrk0', u'updated_time': u'2015-04-29T16:00:01+0000', u'type': u'video', u'id': u'53198517648_10153186245247649', u'status_type': u'added_video', u'icon': u'http://ift.tt/1JLaVTY'}, {u'picture': u'http://ift.tt/1ImpA5V', u'is_hidden': False, u'likes': {u'paging': {u'cursors': {u'after': u'NjM0MzMyOTczMzU5NDI0', u'before': u'MTM4MzY5MDcwMTkzMjAzMg=='}, u'next': u'http://ift.tt/1KNIaKd'}, u'data': [{u'id': u'1383690701932032', u'name': u'Phil Bonsall'}, {u'id': u'778869825481375', u'name': u'Princess Nicole-sunny \u015eenyurt'}, {u'id': u'10203535327456872', u'name': u'Tamtam Harbi'}, {u'id': u'10152575804559808', u'name': u'Justine Edwards'}, {u'id': u'775351169210278', u'name': u'Nathan Evans'}, {u'id': u'808059839312508', u'name': u'Tanya Houghton'}, {u'id': u'398300857023260', u'name': u'Angel Jupe'}, {u'id': u'511529258982389', u'name': u'Amyy James'}, {u'id': u'528646077244967', u'name': u'Finley Rowland'}, {u'id': u'333791050140805', u'name': u'Usman Ali'}, {u'id': u'325500274322410', u'name': u'Shaneil Swaby'}, {u'id': u'731784150271691', u'name': u'Abi Meadows'}, {u'id': u'514048395438260', u'name': u'Erin Tylor Ava Johnson'}, {u'id': u'1475264966125569', u'name': u'Teresa Stubbs'}, {u'id': u'1562679640663006', u'name': u'Kira Dunford'}, {u'id': u'10203874163608982', u'name': u'Matt Reoch'}, {u'id': u'647680085337571', u'name': u'Miszz D Ox'}, {u'id': u'373314132842801', u'name': u'Aoife Mc Erlean'}, {u'id': u'684877051580321', u'name': u'Rajwinder Dhillon'}, {u'id': u'10155891196985346', u'name': u'Joanna Arnold'}, {u'id': u'10155155762375301', u'name': u"Ali 'Als' Moore"}, {u'id': u'10154103048425192', u'name': u'Jolene Chalk'}, {u'id': u'476069229195785', u'name': u'Mollie Ashfield'}, {u'id': u'607143526096436', u'name': u'Liam Costa'}, {u'id': u'634332973359424', u'name': u'Chadley Peakman'}]}, u'from': {u'category': u'Product/Service', u'name': u'Walkers', u'id': u'53198517648'}, u'comments': {u'paging': {u'cursors': 
{u'after': u'ODI=', u'before': u'MTA2'}, u'next': u'http://ift.tt/1ImpBXF'}, u'data': [{u'from': {u'name': u'Jack Reacher', u'id': u'10152949153527863'}, u'like_count': 46, u'can_remove': False, u'created_time': u'2015-04-26T10:32:19+0000', u'message': u'It would have to be a very small sandwich if we were to use a bag of walkers on it!!!', u'id': u'10153170723582649_10153177520892649', u'user_likes': False}, {u'from': {u'name': u'Meg Awuah', u'id': u'641625565957901'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-27T19:05:22+0000', u'message': u'I won a free lunch \U0001f44c\U0001f3fb\nThanks Walkers', u'id': u'10153170723582649_10153181235597649', u'user_likes': False}, {u'from': {u'name': u'Kyle Thomas Spence', u'id': u'10152444873003458'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-28T22:06:47+0000', u'message': u'How about a 1 in 6 chance of crisps in the bag? Would be a nice change.', u'id': u'10153170723582649_10153184604542649', u'user_likes': False}, {u'from': {u'name': u'Steve Hewitt', u'id': u'10205646534593447'}, u'like_count': 23, u'can_remove': False, u'created_time': u'2015-04-27T16:52:37+0000', u'message': u'A full packet would do for me!', u'id': u'10153170723582649_10153180951107649', u'user_likes': False}, {u'from': {u'name': u'Maxine Seller', u'id': u'1572191589662328'}, u'like_count': 1, u'can_remove': False, u'created_time': u'2015-04-27T11:51:45+0000', u'message': u'Iv won 3 vouchers so far \U0001f60a thankyou walkers \U0001f496', u'id': u'10153170723582649_10153180275017649', u'user_likes': False}, {u'from': {u'name': u'Kayleigh Bowe', u'id': u'1206523482707595'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-29T16:58:45+0000', u'message': u'Was just wondering what banana flavoured walkers crisps would taste like x hmmm x mmmm! 
X lol', u'id': u'10153170723582649_10153186403352649', u'user_likes': False}, {u'from': {u'name': u'Jenny Wark', u'id': u'10153197354099670'}, u'like_count': 3, u'can_remove': False, u'created_time': u'2015-04-27T07:54:26+0000', u'message': u'Crisps on a a sandwich is just WRONG!!!!!!!!\U0001f61d', u'id': u'10153170723582649_10153179966542649', u'user_likes': False}, {u'from': {u'name': u'Lisa Watson', u'id': u'297450463798269'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-28T20:14:07+0000', u'message': u'I like strawberry jam sandwich  with a bag of cheese and onion crisps.  x', u'id': u'10153170723582649_10153184252522649', u'user_likes': False}, {u'from': {u'name': u'Alison Largue', u'id': u'10152452678791016'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-27T14:57:16+0000', u'message': u'Banana with salt and vinegar crisps :-) or cheddar cheese with prawn cocktail crisps. Sandwiches have to be cut into squares though :-D', u'id': u'10153170723582649_10153180732142649', u'user_likes': False}, {u'from': {u'name': u'Stacey Jayne Goodwin', u'id': u'10154913080295714'}, u'like_count': 18, u'can_remove': False, u'created_time': u'2015-04-27T10:21:16+0000', u'message': u'Bring us back nom nom', u'id': u'10153170723582649_10153180140847649', u'user_likes': False}, {u'from': {u'name': u'Siobhan Michelle Hishon', u'id': u'10204461658164675'}, u'like_count': 0, u'can_remove': False, u'created_time': u'2015-04-27T12:52:13+0000', u'message': u'I used to love jam sandwiches with cheese and onion crisps tastes lovely', u'id': u'10153170723582649_10153180466617649', u'user_likes': False}...

As you can see there is a date in the output, but I doubt this would be recognized as a date. The best thing I could do is perhaps take today's date and pull everything from a year before it, but I'm not sure how I would do this. Does anyone know?
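The created_time field is a fixed-format string ('2015-05-07T11:13:26+0000'), so it can be parsed with datetime.strptime and compared against a cutoff one year back. A sketch of that filtering step:

```python
from datetime import datetime, timedelta

cutoff = datetime.utcnow() - timedelta(days=365)

def is_recent(post):
    # Graph API timestamps look like '2015-05-07T11:13:26+0000'
    created = datetime.strptime(post['created_time'], '%Y-%m-%dT%H:%M:%S+0000')
    return created >= cutoff

sample = {'created_time': '2015-05-07T11:13:26+0000'}
recent_posts = [p for p in [sample] if is_recent(p)]
```

The Graph API also accepts a `since` parameter on the posts edge, which would push the date filtering to the server side instead of paging through everything.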

traceback while using subprocess.call

import sys
import subprocess
arg1= sys.argv[1]
subprocess.call("inversion_remover.py",arg1)
subprocess.call("test3.py")
subprocess.call("test4.py")

I am getting the following traceback

Traceback (most recent call last):
  File "parent.py", line 4, in <module>
    subprocess.call("inversion_remover.py",arg1)
  File "/usr/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.7/subprocess.py", line 659, in __init__
    raise TypeError("bufsize must be an integer")
TypeError: bufsize must be an integer

How do I solve the above traceback?
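The second positional parameter of subprocess.call() is bufsize, which is why passing arg1 there raises that TypeError. The program and all of its arguments belong in a single list (with the interpreter first, unless the scripts are executable with a shebang). A sketch; the "-c" inline script is used here only to keep the example self-contained, substitute "inversion_remover.py" and friends:

```python
import subprocess
import sys

arg1 = "some-argument"  # stands in for sys.argv[1]

# Everything -- interpreter, script, arguments -- goes in ONE list.
ret = subprocess.call([sys.executable, "-c", "import sys; print(sys.argv[1])", arg1])
print(ret)  # 0 on success
```

So the failing line would become `subprocess.call([sys.executable, "inversion_remover.py", arg1])`.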

Python: Extract values of 1 day from a dictionary with datetime keys [on hold]

I have a dictionary with a datetime string as key and a number as value.

dict = {'08/07/2015 01:15':'3', '08/07/2015 08:15':'5',
 '09/07/2015 07:15':'4', '09/07/2015 10:30':'8'}

I want to extract the values for each day. For example, for 09/07/2015 I want this result:

result = {'09/07/2015 07:15': '4', '09/07/2015 10:30': '8'}

or

result = [4, 8]

Thank you for your help.
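Since the keys are strings with the date first, a dict comprehension keeping the keys that start with the wanted day is the simplest sketch (parsing the keys with datetime.strptime would be the more robust variant; the name `readings` avoids shadowing the built-in `dict`):

```python
readings = {'08/07/2015 01:15': '3', '08/07/2015 08:15': '5',
            '09/07/2015 07:15': '4', '09/07/2015 10:30': '8'}

day = '09/07/2015'
# Keep entries whose key begins with the wanted date.
result = {k: v for k, v in readings.items() if k.startswith(day)}
values = [int(v) for v in result.values()]

print(result)  # {'09/07/2015 07:15': '4', '09/07/2015 10:30': '8'}
```

Both requested forms fall out: `result` is the filtered dictionary, `values` the plain list of numbers.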

Spark + Python - how to set the system environment variables?

I'm on spark-1.4.1. How can I set the system environment variables for Python?

For instance, in R,

Sys.setenv(SPARK_HOME = "C:/Apache/spark-1.4.1")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

What about in Python?

import os
import sys

from pyspark.sql import SQLContext

sc = SparkContext(appName="PythonSQL")
sqlContext = SQLContext(sc)

# Set the system environment variables.
# ref: http://ift.tt/1ImpBqH
if len(sys.argv) < 2:
    path = "file://" + \
        os.path.join(os.environ['SPARK_HOME'], "examples/src/main/resources/people.json")
else:
    path = sys.argv[1]

# Create the DataFrame
df = sqlContext.jsonFile(path)

# Show the content of the DataFrame
df.show()

I get this error,

df is not defined.


Any ideas?
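The Python analogue of R's Sys.setenv is mutating os.environ before PySpark is initialised, and sys.path plays the role of .libPaths. A sketch (the paths are examples only; note also that the snippet above never imports SparkContext, so the `sc = SparkContext(...)` line fails first, which is why `df` never gets defined):

```python
import os
import sys

# Equivalent of R's Sys.setenv(SPARK_HOME = ...): set it before
# creating the SparkContext.
os.environ["SPARK_HOME"] = "C:/Apache/spark-1.4.1"

# Equivalent of .libPaths(): make Spark's Python bindings importable.
sys.path.insert(0, os.path.join(os.environ["SPARK_HOME"], "python"))

print(os.environ["SPARK_HOME"])
```

With those two lines at the top (plus `from pyspark import SparkContext`), the rest of the script can run unchanged.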

How to make code more efficient (For loop?)

I want to make my code more efficient. Instead of repeating the code again for level two, where the only change is that the random numbers go from 1-10 to 1-100, I could make the limit a variable (max_number = 10 for level one and max_number = 100 for level two) and reuse the same code, halving the number of lines. However, as I am not very experienced, I am unsure how to do this. I have been given tips to use a for loop, and I was wondering if anyone could help me with this:

Here is my inefficient code:

#The two imports, import modules. They allow me to use functions defined elsewhere such as when I import a random number or
#use sys.exit() to terminate the game
import random
import sys 

#Ask the user for name, use in opening comments

user_name=str(input("Please type in your name: "))
#print instructions
print('''Hello {}! Welcome!
This game is going to develop your number skills! 

So here is how to play:
Firstly, {}, we are going to give you two numbers.
Then you must choose which of these numbers you think is bigger or which number is smaller.
Type this number in and if you are right you will get a point.
Get enough points and you can progress to level two!
''' .format(user_name, user_name))

#Set the scores of both user and computer to 0
user_score = 0
comp_score = 0
level = 1


#Only use the loop when the user score is below three
#Then randomly select to either have type_question to be bigger or smaller and this will define which path the program will take
while user_score < 3:
    bigorsmall = ['bigger', 'smaller']
    type_question = random.choice(bigorsmall)

#Import two random integers, then make sure these integers are not the same
    a1 = random.randint(1, 10)
    a2 = random.randint(1, 10)
    while a1 == a2:
        a2 = random.randint(1, 10)

    print("\nThe two random values are {} and {}. \n " .format(a1, a2))

#-----------------------------------------------------------------------------------------------------------------------
#-----------------------------------------------------------------------------------------------------------------------
#If the type of question is bigger than the loop to ask for the bigger number is used
    if type_question == 'bigger':
        bigger = max(a1, a2)

#Ask the user to type in which they think is the bigger number
#The while strand means that no other integers then the integers given are allowed
#The except strand of the loop means that only integers can be entered

        while True:
            try:
                user_num = int(input("Which number is bigger:"))
                while user_num != a1 and user_num != a2:
                    user_num = int(input("Please pick either {} or {}:" .format(a1, a2)))
                break
            except ValueError:
                print("That is not an integer, please try again.")

#If users number is correct then give the user one point, if not give computer a point.             
        if user_num == bigger:
            print("\nCorrect, you get one point, keep playing you are doing great!")
            user_score+=1
        else:
            print('\nSadly that is wrong, keep trying! The bigger number was {}, the computer gets one point.'.format(bigger))
            comp_score+=1

        print("Your score is: {} \nThe computers score: {}" .format(user_score, comp_score))

#-----------------------------------------------------------------------------------------------------------------------
#This is the same loop as previously however the purpose is for the user to find the SMALLER number
#This loop is used if type_question was computer generated randomly as smaller        
    elif type_question == 'smaller':
        smaller = min(a1, a2)

        while True:
            try:
                user_num = int(input("Which number is smaller:"))
                while user_num != a1 and user_num != a2:
                    user_num = int(input("Please pick either {} or {}:" .format(a1, a2)))
                break
            except ValueError:
                print("That is not an integer, please try again.")


        if user_num == smaller:
            print('\nCorrect, you get one point, keep playing you are doing great!')
            user_score+=1
        else:
            print('\nSadly that is wrong, keep trying! The smaller number was {}, the computer gets one point.'.format(smaller))
            comp_score+=1

        print("Your score is: {} \nThe computers score: {}".format(user_score, comp_score))       

#-----------------------------------------------------------------------------------------------------------------------
#encourage the user to keep playing + allow an option to quit the game

cont_game = input("\n{} you are doing great! If you would like to keep playing type 'yes' \nIf you would like to quit press any key and then enter:" .format(user_name))

if cont_game == "yes":
    print("\nYAY!")
else:
    print("Hope you had fun!")
    sys.exit() 

#-----------------------------------------------------------------------------------------------------------------------
#-----------------------------------------------------------------------------------------------------------------------   
#Start of a level two
#Same rules apply so no need for new information
#This loop is the same as previous loops, so comments are only placed where changes have been made

user_score = 0
comp_score = 0
level = 2

print("YOU HAVE GOT TO LEVEL {}! \nThe numbers now could be between 1 and 100! \nSame rules apply.".format(level))

print("Your score has reset to 0 and you must get 5 points to win at this game.")

while user_score < 5:
    bigorsmall= ['bigger', 'smaller']
    type_question = random.choice(bigorsmall)

#In level two the integers could be from 1 to 100   
    a1 = random.randint(1, 100)
    a2 = random.randint(1, 100)
    while a1 == a2:
        a2 = random.randint(1, 100)

    print("\nThe two random values are {} and {}. \n " .format(a1, a2))

#-----------------------------------------------------------------------------------------------------------------------
    if type_question == 'bigger':
        bigger = max(a1, a2)

        while True:
            try:
                user_num = int(input("Which number is bigger:"))
                while user_num != a1 and user_num != a2:
                    user_num = int(input("Please pick either {} or {}:" .format(a1, a2)))
                break
            except ValueError:
                print("That is not an integer, please try again.")

        if user_num == bigger:
            print("\nCorrect, you get one point, keep playing you are doing great!")
            user_score+=1
        else:
            print('\nSadly that is wrong, keep trying! The bigger number was {}, the computer gets one point.'.format(bigger))
            comp_score+=1

        print("Your score is: {} \nThe computers score: {}" .format(user_score, comp_score))

#-----------------------------------------------------------------------------------------------------------------------
    elif type_question == 'smaller':
        smaller = min(a1, a2)

        while True:
            try:
                user_num = int(input("Which number is smaller:"))
                while user_num != a1 and user_num != a2:
                    user_num = int(input("Please pick either {} or {}:" .format(a1, a2)))
                break
            except ValueError:
                print("That is not an integer, please try again.")

        if user_num == smaller:
            print('\nCorrect, you get one point, keep playing you are doing great!')
            user_score+=1
        else:
            print('\nSadly that is wrong, keep trying! The smaller number was {}, the computer gets one point.'.format(smaller))
            comp_score+=1

        print("Your score is: {} \nThe computer's score: {}".format(user_score, comp_score))

#-----------------------------------------------------------------------------------------------------------------------
#-----------------------------------------------------------------------------------------------------------------------
#End of game, automatically terminate the program
print("You have won the game, I hope you had fun!")
sys.exit()
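The bigger/smaller branches above are nearly identical; a small helper (a sketch, not part of the original program) can generate a round and its correct answer in one place:

```python
import random

def ask_round(low=1, high=100):
    # Draw two distinct numbers and decide which question to ask.
    a1 = random.randint(low, high)
    a2 = random.randint(low, high)
    while a1 == a2:
        a2 = random.randint(low, high)
    kind = random.choice(['bigger', 'smaller'])
    answer = max(a1, a2) if kind == 'bigger' else min(a1, a2)
    return a1, a2, kind, answer
```

The main loop then needs only one input/scoring block, using `kind` in the prompt and comparing the player's answer against `answer`.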

How can I pass a time string in the uri path to a Google Cloud Endpoints API?

I am trying to pass a time value as a string in the path to a Google endpoints API.

If I use "15:00" as the value, I get the error:

"No endpoint found for path: conference/v1/dislikedSessions/Keynote/15:00"

The path is found if I send a string like "void" or a number like "1500".

Resource Container

SESS_TYPETIME_REQUEST = endpoints.ResourceContainer(
    message_types.VoidMessage,
    typeOfSession=messages.StringField(1),
    startTime=messages.StringField(2),
)

Endpoint code:

    @endpoints.method(SESS_TYPETIME_REQUEST, SessionForms,
        path='dislikedSessions/{typeOfSession}/{startTime}',
        http_method='GET', name='dislikedSessions')
    def dislikedSessions(self, request):
      """Return sessions that are not of specified type and earlier than start time."""
      filterTime = datetime.strptime(urllib2.unquote(request.startTime), "%H:%M").time()

I can fall back to plain numbers (with a change to the strptime format), but I can't understand why the path won't pick up a string like 15:59.
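One thing worth checking is whether the colon is percent-encoded before it goes into the path; whether the Endpoints router then matches the segment is a separate question, but the encoding itself is straightforward (Python 3 shown here; `urllib.quote`/`urllib2.unquote` in Python 2):

```python
from urllib.parse import quote, unquote

raw = '15:00'
encoded = quote(raw, safe='')   # percent-encode reserved characters, ':' included
print(encoded)                  # 15%3A00
print(unquote(encoded))         # 15:00
```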

Thanks, Justin.

Plotting 4d-data

I have points in 4 dimensions (let's call them v, w, y, z) which I would like to visualize.

My plan is to have two squares, (v x w, y x z), next to each other and then just plot each point twice.

Given two points ([1, 1, 1, 3], [2, 2, 2, 2]) I envision something like this:

[image: hand-drawn "1337 paint skillz" sketch of the two squares]

Given a small set of points, I could use different colors to show which points on the left correspond to the right. With a large set of points, that would be futile. But perhaps heat maps would then be the best to visualize it?

Or is there some alternative established way to visualize data of higher dimensions within python/matplotlib?

Here's some sample data:

>>> resultsArray[:,:4]
array([[ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.00495236,  0.03919034,  0.00495287,  0.03919042],
       [ 0.00240293,  0.02667374,  0.00220419,  0.02693434],
       [ 0.0011231 ,  0.0191784 ,  0.00104353,  0.01928256],
       [ 0.00547274,  0.04187615,  0.00657255,  0.04043363],
       [ 0.00291993,  0.0286196 ,  0.00292006,  0.02861962],
       [ 0.00128136,  0.01975574,  0.00121107,  0.01984781],
       [ 0.00591335,  0.04531384,  0.00873814,  0.04160714],
       [ 0.00345499,  0.0310103 ,  0.00396032,  0.03034784],
       [ 0.00149387,  0.02056065,  0.0014939 ,  0.02056065],
       [ 0.00274306,  0.02667374,  0.00220419,  0.02659422],
       [ 0.00123893,  0.01948363,  0.00108284,  0.01952189],
       [ 0.00162006,  0.02379926,  0.00143157,  0.02389168],
       [ 0.00347023,  0.0286196 ,  0.00292006,  0.02806932]])
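The two-squares idea can be sketched directly in matplotlib: two scatter subplots that share a color mapping, so corresponding points get the same color in both panels (a minimal sketch using two illustrative points, not the sample data):

```python
import matplotlib
matplotlib.use('Agg')               # headless backend, so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

pts = np.array([[1.0, 1.0, 1.0, 3.0],
                [2.0, 2.0, 2.0, 2.0]])

fig, (ax_vw, ax_yz) = plt.subplots(1, 2, figsize=(8, 4))
colors = np.arange(len(pts))        # same color index = same 4-d point
ax_vw.scatter(pts[:, 0], pts[:, 1], c=colors, cmap='viridis')
ax_yz.scatter(pts[:, 2], pts[:, 3], c=colors, cmap='viridis')
ax_vw.set(xlabel='v', ylabel='w')
ax_yz.set(xlabel='y', ylabel='z')
fig.savefig('pairs.png')
```

For many points, replacing the color index with a value of interest, or switching each square to `hexbin`/2-d histograms, turns this into the heat-map variant mentioned above.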

HTML code to show splitted data_frame in one html page using python

I am a newbie in HTML/CSS, so I have a question about showing data in HTML format. What I have is a long list which I want to split and show in HTML format as two separate column pairs. For example, instead of:

Col1 Col2
1     a
2     a
3     a
4     a
5     b
6     b
7     b
8     b

I want to see text as

Col1 Col2   Col1   Col2
1     a      5       b
2     a      6       b 
3     a      7       b
4     a      8       b

How should my HTML/CSS code look to show the data above in a split table?

For the first output, seeing all the data in two columns in one table, I am using this Python code:

start = '''<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta></head> '''
font_size = '14pt'


style = '''<style media="screen" type="text/css">
table.table1 {

  border-collapse: collapse;
  width: 20%;
  font-size: '''+font_size+''';

}

td {

  text-align: left;
  border: 1px solid #ccc;
}
th {

  text-align: left;
  border: 1px solid #ccc;
  background-color: #072FB1;
}
</style>
'''

title = '''<div align="center"></br><font size = "24"><strong>'''+title+'''</strong></font></br></br></div>'''

df_data1 = df_data[1:10]
data = df_data1.to_html( index = False, na_rep ='' )
data = data.replace('None', '')

style_headers = 'background-color: #072FB1; color: #ffffff;'
style_status_new ='background-color: #587EF8; color: #ffffff;font-weight:bold'

style_first_col = 'font-weight:bold;'

total = 'TOTAL'
soup = bs4.BeautifulSoup(data)
soup.thead.tr.attrs['style'] = style_headers

html = start+lentos+style+'''<body bgcolor="#FFFFFF">'''+title+time+unicode.join(u'\n',map(unicode,soup))+finish 

try:
    with open(dir_files+'engines_tv_html.html', 'w') as file:
        file.write(html.encode('UTF-8'))
except Exception, e:
    log_error()

Where df_data[1:10] is how I split my data into separate dataframes. So the question is how to show the split dataframe (one table on the left and another on the right) on one HTML page.
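One way to get the two halves side by side, staying in pandas for the HTML: render each half with `to_html` and wrap both in `inline-block` divs. The column names and 8-row frame below come from the example above; the styling is only a sketch:

```python
import pandas as pd

df = pd.DataFrame({'Col1': range(1, 9), 'Col2': list('aaaabbbb')})
half = len(df) // 2
left, right = df.iloc[:half], df.iloc[half:]

# inline-block keeps both tables on one line instead of stacking them
wrap = '<div style="display:inline-block; vertical-align:top; margin-right:2em">{}</div>'
html = wrap.format(left.to_html(index=False)) + wrap.format(right.to_html(index=False))
```

The resulting `html` string can be spliced into the existing `start + style + ... + finish` assembly in place of the single `data` table.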

Best way to combine Python (backend) and d3.js to create a real-time data visualisation [on hold]

What I am looking for is the best way to combine Python (backend) and d3.js to create a real-time data visualisation.

I have coded a tool in Python that produces data about a network and updates the information potentially in real-time. I would like to show the development of the network in real-time in the browser. There will be a few thousand nodes.

The Python tool produces a Pandas dataframe like this (I am not sure where the aggregation step should take place though):

[image: Dataframe]

The output should be a (real-time interactive) network like this:

[image: Network]

How do I "connect" Python with d3.js so that the graph can be visualised in (near) real-time?

Even hints would be very much appreciated.
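One simple pattern, sketched here with only the standard library (a real app might prefer Flask or websockets): expose the current graph as a JSON endpoint and have the d3 page poll it on a timer. The node/link structure below is a hypothetical stand-in for the dataframe:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the Python tool's output; a real app would rebuild this
# from the dataframe on each request.
def current_graph():
    return {'nodes': [{'id': 'a'}, {'id': 'b'}],
            'links': [{'source': 'a', 'target': 'b'}]}

class GraphHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(current_graph()).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Access-Control-Allow-Origin', '*')  # let the d3 page fetch it
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(('127.0.0.1', 0), GraphHandler)   # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
```

On the d3 side, something like `d3.json(url).then(render)` inside `setInterval` re-fetches and re-renders; for true push updates, websockets avoid the polling delay.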

Saving XML using ETree in Python. It's not retaining namespaces, and adding ns0, ns1 and removing xmlns tags

I see there are similar questions here, but nothing that has totally helped me. I've also looked at the official documentation on namespaces but can't find anything that is really helping me, perhaps I'm just too new at XML formatting. I understand that perhaps I need to create my own namespace dictionary? Either way, here is my situation:

I am getting a result from an API call, it gives me an XML that is stored as a string in my Python application.

What I'm trying to accomplish is just to grab this XML, swap out a tiny value (the b:string value under ConditionValue/Default, but that's irrelevant to this question) and then save it as a string to send later on in a REST POST call.

The source XML looks like this:

<Context xmlns="http://ift.tt/1Ukxpk1" xmlns:i="http://ift.tt/ra1lAU">
<xmlns i:nil="true" xmlns="http://ift.tt/1MK06EM" xmlns:a="http://ift.tt/1Ukxnc1"/>
<Conditions xmlns:a="http://ift.tt/1MK06EM">
    <a:Condition>
        <a:xmlns i:nil="true" xmlns:b="http://ift.tt/1Ukxnc1"/>
        <Identifier>a23aacaf-9b6b-424f-92bb-5ab71505e3bc</Identifier>
        <Name>Code</Name>
        <ParameterSelections/>
        <ParameterSetCollections/>
        <Parameters/>
        <Summary i:nil="true"/>
        <Instance>25486d6c-36ba-4ab2-9fa6-0dbafbcf0389</Instance>
        <ConditionValue>
            <ComplexValue i:nil="true"/>
            <Text i:nil="true" xmlns:b="http://ift.tt/1bQEByM"/>
            <Default>
                <ComplexValue i:nil="true"/>
                <Text xmlns:b="http://ift.tt/1bQEByM">
                    <b:string>NULLCODE</b:string>
                </Text>
            </Default>
        </ConditionValue>
        <TypeCode>String</TypeCode>
    </a:Condition>
    <a:Condition>
        <a:xmlns i:nil="true" xmlns:b="http://ift.tt/1Ukxnc1"/>
        <Identifier>0af860f6-5611-4a23-96dc-eb3863975529</Identifier>
        <Name>Content Type</Name>
        <ParameterSelections/>
        <ParameterSetCollections/>
        <Parameters/>
        <Summary i:nil="true"/>
        <Instance>6364ec20-306a-4cab-aabc-8ec65c0903c9</Instance>
        <ConditionValue>
            <ComplexValue i:nil="true"/>
            <Text i:nil="true" xmlns:b="http://ift.tt/1bQEByM"/>
            <Default>
                <ComplexValue i:nil="true"/>
                <Text xmlns:b="http://ift.tt/1bQEByM">
                    <b:string>Standard</b:string>
                </Text>
            </Default>
        </ConditionValue>
        <TypeCode>String</TypeCode>
    </a:Condition>
</Conditions>

My job is to swap out one of the values, retaining the entire structure of the source, and use this to submit a POST later on in the application.

The problem that I am having is that when it saves to a string or to a file, it totally messes up the namespaces:

<ns0:Context xmlns:ns0="http://ift.tt/1Ukxpk1" xmlns:ns1="http://ift.tt/1MK06EM" xmlns:ns3="http://ift.tt/1bQEByM" xmlns:xsi="http://ift.tt/ra1lAU">
<ns1:xmlns xsi:nil="true" />
<ns0:Conditions>
<ns1:Condition>
<ns1:xmlns xsi:nil="true" />
<ns0:Identifier>a23aacaf-9b6b-424f-92bb-5ab71505e3bc</ns0:Identifier>
<ns0:Name>Code</ns0:Name>
<ns0:ParameterSelections />
<ns0:ParameterSetCollections />
<ns0:Parameters />
<ns0:Summary xsi:nil="true" />
<ns0:Instance>25486d6c-36ba-4ab2-9fa6-0dbafbcf0389</ns0:Instance>
<ns0:ConditionValue>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text xsi:nil="true" />
<ns0:Default>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text>
<ns3:string>NULLCODE</ns3:string>
</ns0:Text>
</ns0:Default>
</ns0:ConditionValue>
<ns0:TypeCode>String</ns0:TypeCode>
</ns1:Condition>
<ns1:Condition>
<ns1:xmlns xsi:nil="true" />
<ns0:Identifier>0af860f6-5611-4a23-96dc-eb3863975529</ns0:Identifier>
<ns0:Name>Content Type</ns0:Name>
<ns0:ParameterSelections />
<ns0:ParameterSetCollections />
<ns0:Parameters />
<ns0:Summary xsi:nil="true" />
<ns0:Instance>6364ec20-306a-4cab-aabc-8ec65c0903c9</ns0:Instance>
<ns0:ConditionValue>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text xsi:nil="true" />
<ns0:Default>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text>
<ns3:string>Standard</ns3:string>
</ns0:Text>
</ns0:Default>
</ns0:ConditionValue>
<ns0:TypeCode>String</ns0:TypeCode>
</ns1:Condition>
</ns0:Conditions>

I've narrowed the code down to the most basic form and I'm still getting the same results so it's not anything to do with how I'm manipulating the file normally:

import xml.etree.ElementTree as ET
import requests

get_context_xml = 'http://localhost/testapi/returnxml' #returns first XML example above.
source_context_xml = requests.get(get_context_xml).text  # the XML body as a string

Tree = ET.fromstring(source_context_xml)

#Ensure the original namespaces are intact.
for Conditions in Tree.iter('{http://ift.tt/1Ukxpk1}Conditions'):
    print "success"

with open('/home/memyself/output.xml','w') as f:
    f.write(ET.tostring(Tree))
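ElementTree invents `ns0`, `ns1`, ... because it does not remember the prefixes from the input; registering them before serializing makes it reuse the original ones. A minimal demonstration on a toy document (the real URIs from the API response would be registered the same way):

```python
import io
import xml.etree.ElementTree as ET

xml_src = '<a:root xmlns:a="http://example.com/a"><a:child>value</a:child></a:root>'

# Harvest every prefix declared in the document and register it, so that
# tostring() emits the same prefixes instead of generating ns0, ns1, ...
for _, (prefix, uri) in ET.iterparse(io.BytesIO(xml_src.encode()), events=['start-ns']):
    ET.register_namespace(prefix, uri)

root = ET.fromstring(xml_src)
out = ET.tostring(root).decode()
```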

memory issue in json conversion

In my pandas program I am reading a CSV and converting some columns to JSON.

For example, my CSV looks like this:

id_4 col1  col2 .....................................col100
1     43    56  .....................................67
2     46    67   ....................................78

What I want to achieve is:

id_4 json

1    {"col1":43,"col2":56,.....................,"col100":67}
2    {"col1":46,"col2":67,.....................,"col100":78}

The code I have tried is as follows:

    df = pd.read_csv('file.csv')
    cols = ['col1', 'col2', ............, 'col100']

    def func(df):
        # one dict per row, mapping each column name to that row's value
        d = [dict(zip(cols, row))
             for row in zip(*(df[c].astype(str) for c in cols))]

        format_data = json.dumps(d)
        format_data = format_data[1:len(format_data)-1]
        json_data = '{"key":'+format_data+'}'
        result.append(pd.Series([df['id_4'].unique()[0], json_data], index = headers))
        return df

    df.groupby('id_4').apply(func)

b = open('output.csv', 'w')
writer = csv.writer(b)
writer.writerow(headers)
writer.writerows(result[1:len(result)])

The CSV contains some 1 lakh (100,000) rows and is about 15 MB. When I execute this, the process is killed automatically after a long time. I think it is a memory issue.

As I am a newbie to Python and pandas: is there any way to optimize the above code so it works properly, or is increasing the memory the only way?

I am using a Linux system with 5 GB of RAM.
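For a file this size, the dominant cost is materializing everything in memory at once; processing one row at a time with the standard csv module keeps memory flat regardless of row count. A sketch using in-memory stand-ins for file.csv and output.csv (swap in `open(...)` for the real files; the two value columns stand for col1..col100):

```python
import csv
import io
import json

src = io.StringIO("id_4,col1,col2\n1,43,56\n2,46,67\n")   # stand-in for file.csv
dst = io.StringIO()                                       # stand-in for output.csv

writer = csv.writer(dst)
writer.writerow(['id_4', 'json'])
for row in csv.DictReader(src):      # streams one row at a time
    key = row.pop('id_4')            # everything left is the value columns
    writer.writerow([key, json.dumps(row)])
```

With real files this never holds more than one row at a time, so a 15 MB input is no problem even on a small machine.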

kdev-python with python 3 support

I just discovered kdevelop and the kdev-python plugin, it is just awesome, to say the least.

However, on my system (Kubuntu 15.04), when I install kdev-python via apt-get it seems that there is no support for python 3 yet, as I can't execute a python 3 script.

I've read on the Internet that Python 3 support has been implemented, but what is the package I have to install? Or is it only available from source? On Launchpad I've read that kdev-python3 is a missing package for the moment, but maybe there is another way.

Thanks.

python custom JSON encoder/decoder not working as expected

I am trying to encode/decode nested objects, and I end up with the list of nested objects as a string instead of JSON objects.

import json

class A:
    def __init__ (self, n, a):
        self.n = n
        self.a = a

class B:
    def __init__ (self, b, listOfA):
        self.b = b
        self.listOfA = []
        for a in listOfA:
            self.listOfA.append(a)

class AEncoder (json.JSONEncoder):
    def default (self, obj):
        if isinstance (obj, A):
            return {
                'n' : obj.n,
                'a' : obj.a
            }
        return json.JSONEncoder.default(self, obj)

class BEncoder (json.JSONEncoder):
    def default (self, obj):
        if isinstance (obj, B):
            return {
                'b' : obj.b,
                'listOfA' : json.dumps(obj.listOfA, cls=AEncoder)
            }
        return json.JSONEncoder.default(self, obj)

listOfA = [A('n1', 'a1'), A('n2', 'a2')]
tmpB = B('b', listOfA)

For object A it is working correctly as it is fairly straightforward. The output I get for B is something like this:

{
    "b" : "b",
    "listOfA" : "[{\"n\" : \"n1\", \"a\" : \"a1\"}, {\"n\" : \"n2\", \"a\" : \"a2\"}]"
}

Any ideas where I am wrong? The output should be like this:

{
    "b" : "b",
    "listOfA" : [{"n" : "n1", "a" : "a1"}, {"n" : "n2", "a" : "a2"}]
}
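The string appears because the inner `json.dumps(...)` has already serialized `listOfA`, so the outer encoder sees a plain string. One way to get the nested form (a sketch reusing the question's class shapes): a single encoder that returns plain dicts and lists and lets `json.dumps` recurse:

```python
import json

class A:
    def __init__(self, n, a):
        self.n, self.a = n, a

class B:
    def __init__(self, b, listOfA):
        self.b, self.listOfA = b, list(listOfA)

class ABEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, A):
            return {'n': obj.n, 'a': obj.a}
        if isinstance(obj, B):
            # Return the raw list: dumps() recurses into it and calls
            # default() again for every A inside, so no inner dumps() is needed.
            return {'b': obj.b, 'listOfA': obj.listOfA}
        return super().default(obj)

result = json.dumps(B('b', [A('n1', 'a1'), A('n2', 'a2')]), cls=ABEncoder)
```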

logging module & syslog module, which is better in Python?

I know we can write a message to syslog with logging or syslog module in Python.

My question is: Are there any differences between them? If any, what is it?
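One practical difference worth noting: the syslog module talks only to the local syslog, while the logging module reaches syslog through a handler and can therefore be redirected to files, streams, or remote servers without touching the logging calls. A minimal sketch (the address is an assumption; SysLogHandler defaults to UDP, so nothing needs to be listening for this to run):

```python
import logging
import logging.handlers

log = logging.getLogger('demo')
log.setLevel(logging.INFO)

# The handler decides the destination; the rest of the code just logs.
handler = logging.handlers.SysLogHandler(address=('localhost', 514))
handler.setFormatter(logging.Formatter('%(name)s: %(levelname)s %(message)s'))
log.addHandler(handler)
log.info('hello from the logging module')
```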

Detecting whether an e-shop is written in Magento

I'm trying to create a script which does a simple thing: the input is a list of URLs and the output is the list of those e-shops which are written in Magento.

I've read that there is no certain way to tell whether an e-shop runs Magento or something else, but I've also read that there are a lot of signs that can tell you, almost 100% surely, that a page is using Magento.

So I've found this page: magento detector, which can tell you whether a site is Magento or not, so I'm trying to use their information.

They say for example this:

Magento has its user interface files in a directory called /skin/. For the frontend (not the admin ui) the files are located in /skin/frontend. So if this directory exists in the page source then it is very likely that the store runs on Magento.

For example, for this e-shop: starkk, the detector says it is Magento, and one of the conditions it meets is the one I mentioned above.

How could I check whether the directory exists? I took a look at http://ift.tt/1SHgVVO in a browser, but the page raises an error.

And an additional question: do you know another, better way to detect Magento?
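The `/skin/frontend` sign from the detector can be checked without requesting the directory itself: fetch the shop's homepage and look for the path in the page source. A sketch with that check plus two extra markers (the extra markers are my own guesses at common Magento fingerprints, not taken from the detector page):

```python
import urllib.request

# '/skin/frontend/' comes from the detector's description; 'Mage.Cookies'
# and 'var FORM_KEY' are hypothetical extra fingerprints.
MAGENTO_MARKERS = ['/skin/frontend/', 'Mage.Cookies', 'var FORM_KEY']

def looks_like_magento(html):
    """Heuristic only: any marker present suggests (not proves) Magento."""
    return any(marker in html for marker in MAGENTO_MARKERS)

def check_url(url):
    # Fetch the homepage source and apply the heuristic.
    html = urllib.request.urlopen(url, timeout=10).read().decode('utf-8', 'replace')
    return looks_like_magento(html)
```

Running `check_url` over the URL list and keeping the ones that return True gives the filtered output the question asks for.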

Send SMS in python from android phone. (Python)

I want to run a script in python that sends a text message to a selected number from my android phone. Is this even possible with python?

How to use multiprocessing or threading with tkinter

I am trying to make a simple space invaders game, and a problem I have run into is getting things to happen at the same time. I have bound the shooting action to the canvas of the game so that when you click, a function is called. I would like this function to be callable multiple times at once so that multiple "lasers/bullets" can be seen on the screen at any one time. At the minute, when you click and a "laser/bullet" is already on screen, the previous one disappears and a new one appears. CODE:

class Game1():

    def __init__(self, xcoord1=380, ycoord1=550, xcoord2=400, ycoord2=570):
        self.xcoord1, self.ycoord1 = xcoord1, ycoord1
        self.xcoord2, self.ycoord2 = xcoord2, ycoord2
        self.Master = Master
        self.Master.geometry("800x600+300+150")
        Game1Canvas = Canvas(self.Master, bg="black", height=600, width=800)
        Game1Canvas.place(x=0, y=0)
        self.Canvas = Game1Canvas
        self.Canvas.bind("<Button-1>", self.Shoot)
        self.Ship = self.Canvas.create_rectangle(self.xcoord1, self.ycoord1, self.xcoord2, self.ycoord2, fill = "red")

    def Shoot(self, event):
        self.LaserLocation = 0
        for self.LaserLocation in range(0, 112):
            Master.after(1, self.Canvas.create_rectangle(self.xcoord1, self.ycoord1 - (self.LaserLocation * 5), self.xcoord2 - 5, self.ycoord2 - (self.LaserLocation * 5), fill = "pink", tag=str(CurrentTag)))
            Master.update()
            self.Canvas.delete(str(CurrentTag))

This is a much more "dumbed" down version of the code at the minute because I've been trying a bunch of different ways to get this working and it's a mess. I am aware of the multiprocessing and threading imports and I have tried them both but am unable to get them working for my code. If someone could reply back with a solution I would be very grateful. Cheers.
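Threads are usually unnecessary here: tkinter can animate any number of bullets from one `after()` timer if each bullet is kept in a collection and moved a little on every tick. A sketch of that pattern (names and geometry are illustrative, not taken from the original program); the movement bookkeeping is pulled into a plain function so it is easy to test:

```python
import tkinter as tk

def advance(bullets, dy=-5, top=0):
    # Move every bullet box up by dy; keep only those still on screen.
    moved = [[x1, y1 + dy, x2, y2 + dy] for x1, y1, x2, y2 in bullets]
    return [b for b in moved if b[3] > top]

def main():
    root = tk.Tk()
    canvas = tk.Canvas(root, bg="black", width=800, height=600)
    canvas.pack()
    items = {}   # canvas item id -> [x1, y1, x2, y2]

    def shoot(event):
        box = [event.x, 560, event.x + 5, 570]
        items[canvas.create_rectangle(*box, fill="pink")] = box

    def tick():
        for item, box in list(items.items()):
            new = advance([box])
            if new:
                items[item] = new[0]
                canvas.coords(item, *new[0])
            else:
                canvas.delete(item)
                del items[item]
        root.after(16, tick)   # one shared timer animates every bullet

    canvas.bind("<Button-1>", shoot)
    tick()
    root.mainloop()

# main()  # uncomment to run the demo (needs a display)
```

Each click adds a new canvas item; the single `tick` loop moves them all, so earlier bullets keep flying while new ones appear.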

Error message related to SIP & PyQt4

Currently, I am involved in the development of a graphical experiment builder, which is based on Python. Running the stable version works perfectly fine. However, as soon as I try out the latest snapshot, I get a rather cryptic error message. Based on some googling, I suppose it is related to the PyQt4 and sip packages that are used.

Here is the error:

python: /build/buildd/sip4-4.15.5/siplib/siplib.c:8407: sip_api_can_convert_to_type: Assertion `(((td)->td_flags & 0x0007) == 0x0000) || (((td)->td_flags & 0x0007) == 0x0002)' failed.

Aborted (core dumped)

Also, while googling I found this discussion. The error message there is the same, but the situation a bit different.

If anyone could give some pointers to why this error occurs, or even how to fix it, I'd be very thankful.


pygame bullet physics messed up by scrolling

The code below is the bullet class for my shooter game in pygame. As you can see if you run the full game (http://ift.tt/1LYUzLv), the code works great for firing bullets at the cursor as long as the player isn't moving. However, I recently added scrolling, where I change a global offsetx and offsety every time the player gets close to an edge. These offsets are then used to draw each object in their respective draw functions.

Unfortunately, the bullet physics in the bullet's __init__ function no longer work as soon as the player scrolls and the offsets are added. Why are the offsets messing up my math, and how can I change the code to get the bullets to fire in the right direction?

class Bullet:
    def __init__(self,mouse,player):
        self.exists = True
        centerx = (player.x + player.width/2)
        centery = (player.y + player.height/2)
        self.x = centerx
        self.y = centery
        self.launch_point = (self.x,self.y)
        self.width = 20
        self.height = 20
        self.name = "bullet"
        self.speed = 5
        self.rect = None
        self.mouse = mouse
        self.dx,self.dy = self.mouse

        distance = [self.dx - self.x, self.dy - self.y]
        norm = math.sqrt(distance[0] ** 2 + distance[1] ** 2)
        direction = [distance[0] / norm, distance[1] / norm]
        self.bullet_vector = [direction[0] * self.speed, direction[1] * self.speed]

    def move(self):
        self.x += self.bullet_vector[0]
        self.y += self.bullet_vector[1]

    def draw(self):
        make_bullet_trail(self,self.launch_point)
        self.rect = pygame.Rect((self.x + offsetx,self.y + offsety),(self.width,self.height))
        pygame.draw.rect(screen,(255,0,40),self.rect)
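A likely cause: the mouse position is in screen coordinates, while player.x/player.y are in world coordinates (the draw functions add the offsets when rendering), so once the offsets are nonzero the two spaces disagree and the direction vector points the wrong way. A sketch of the conversion (the function name is mine; `offsetx`/`offsety` stand for the game's globals):

```python
import math

def bullet_vector(mouse, center, speed=5, offsetx=0, offsety=0):
    """Aim from a world-space center at a screen-space mouse click."""
    mx, my = mouse
    wx, wy = mx - offsetx, my - offsety          # screen -> world
    dx, dy = wx - center[0], wy - center[1]
    norm = math.hypot(dx, dy) or 1.0             # avoid dividing by zero
    return (dx / norm * speed, dy / norm * speed)
```

In the class above, that corresponds to subtracting offsetx/offsety from `self.dx` and `self.dy` before computing `distance`.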