
I have trouble doing this...

You can see the expected result at http://i.joyton.com:2010 by uploading this image and keeping the other parameters at their defaults.

def search_img(item, image_name):
    try:
        f = open(image_name, 'rb')
        img = f.read()
        print type(img)
    except IOError, e:
        print 'fail to open %s' % image_name
        print e
        return None

    ts = str(time.time())

    m = md5.new('testsearch_by_image' + item)
    m.update(ts)
    m.update('0123456789')
    sign = m.hexdigest()

    params = urllib.urlencode( {
        'item': item,
        'app_key': 'test',
        'cmd':'search_by_image',
        'sign':sign,
        'img_file':img,
        'extra':'',
        'time_stamp':ts,
        })

    headers = {'Content-type': 'application/x-www-form-urlencode',
               'Accept': 'text/plain'}

    conn = httplib.HTTPConnection('i.joyton.com', 2010)
    conn.request('POST', '', params, headers)
    response = conn.getresponse()

    print response.status, response.reason
    print response.read()
    conn.close()
    return response.read()

if __name__ == '__main__':
    search_img('book', 'f:\\book_001.jpg')

In the browser everything works perfectly, but my script does not. Sometimes the script returns the right result, sometimes it returns other books, and sometimes it returns nothing at all. When it returns other books, they are usually ones that other users searched for recently.

  • You would probably need a multipart form post rather than a normal urlencoded post. Commented Jun 20, 2012 at 4:47
  • I changed the Content-type to multipart/form-data, but nothing changed. Commented Jun 20, 2012 at 6:03
  • I can achieve this another way, and it works well: I use HttpWatch to monitor what the browser sends to the server, then send the same data to the server over a socket. But that is crude and not flexible. Commented Jun 20, 2012 at 6:38
  • Browsers also send cookies, and the server might make the same selection over and over again based on that information. Your Python code does not. Log out from the site (if necessary), flush the cookies for this site, and see if you can reproduce the same problems as with your script. Using mechanize is another way to achieve this kind of thing in Python without having to hang your browser from puppet strings. Commented Jun 20, 2012 at 6:57

1 Answer


Here is your code modified to send a multipart/form-data request. It did not work when I tested it from my PC against your URL, so it probably needs some more hacking (maybe the sign isn't right, or something else is off) before you can get it to work.

import mimetypes
import string
import random
import time
import md5
import httplib

def upload(fields,files):
    boundaryChars = list(string.lowercase) + list(string.uppercase) + \
                    [str(x) for x in range(10)] + ['_'*10]
    random.shuffle(boundaryChars)    

    boundary = '----------RaNdOm_crAPP'+''.join(boundaryChars[:20])
    CRLF = '\r\n'
    elem = []
    for key in fields:
        elem.append('--' + boundary)
        elem.append('Content-Disposition: form-data; name="%s"' % key)
        elem.append('')
        elem.append(fields[key])
    for (key, filename,value) in files:
        elem.append('--' + boundary)
        elem.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (key, filename))
        # Parenthesize the fallback: '%' binds tighter than 'or'.
        elem.append('Content-Type: %s' % (mimetypes.guess_type(filename)[0] or
                                          'application/octet-stream'))
        elem.append('')
        elem.append(value)
    elem.append('--' + boundary + '--')
    elem.append('')
    body = CRLF.join(elem)
    content_type = 'multipart/form-data; boundary=%s' % boundary
    return content_type, body

def search_img(item, image_name):
    try:
        f = open(image_name, 'rb')
        img = f.read()
    except IOError, e:
        print 'fail to open %s' % image_name
        print e
        return None

    ts = str(time.time())

    m = md5.new('testsearch_by_image' + item)
    m.update(ts)
    m.update('0123456789')
    sign = m.hexdigest()

    #params = urllib.urlencode( )

    contentType,body = upload({
        'item': item,
        'app_key': 'test',
        'cmd':'search_by_image',
        'sign':sign,
        #'img_file':img,
        'extra':'',
        'time_stamp':ts,
        },
        [('img_file', image_name, img)]
    )
    headers = {
        'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Content-type': contentType,
        'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5',
        'Host':'i.joyton.com:2010',
        'Origin':'http://i.joyton.com:2010',
        'Referer':'http://i.joyton.com:2010/'
    }
    #print c
    #print body

    conn = httplib.HTTPConnection('i.joyton.com', 2010)
    conn.request('POST', '/', body, headers)
    response = conn.getresponse()

    print response.status, response.reason
    data = response.read()  # the body can only be read once, so keep it
    print data
    conn.close()
    return data

if __name__ == '__main__':
    search_img('book', 'iMgXS.jpg') #the same image.
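
For anyone adapting this to a newer Python, the same two steps (computing the sign and building the multipart body) could look roughly like the sketch below. It uses hashlib because the md5 module no longer exists in Python 3, and it builds the body as bytes so the raw image data is never mixed with text strings. The function names make_sign and encode_multipart are mine, and the parameter names app_key, cmd and secret are only my guess at what the literal strings 'test', 'search_by_image' and '0123456789' stand for; the server's real signing rules are not documented here, so treat this as untested against the site.

```python
import hashlib
import mimetypes
import uuid


def make_sign(item, timestamp, app_key='test', cmd='search_by_image',
              secret='0123456789'):
    """MD5 hex digest of app_key + cmd + item + timestamp + secret.

    This mirrors the md5.new(...)/update(...) calls above; the meaning of
    each piece is an assumption, only the concatenation order is taken
    from the original code.
    """
    raw = app_key + cmd + item + timestamp + secret
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


def encode_multipart(fields, files):
    """Build a multipart/form-data body as bytes.

    fields: dict mapping field name -> text value
    files:  list of (field name, filename, file content as bytes)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex  # random 32-character hex boundary
    delimiter = ('--' + boundary).encode('ascii')
    parts = []
    for name, value in fields.items():
        parts.append(delimiter)
        parts.append(('Content-Disposition: form-data; name="%s"'
                      % name).encode('utf-8'))
        parts.append(b'')  # blank line separates part headers from content
        parts.append(value.encode('utf-8'))
    for name, filename, data in files:
        mime = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
        parts.append(delimiter)
        parts.append(('Content-Disposition: form-data; name="%s"; filename="%s"'
                      % (name, filename)).encode('utf-8'))
        parts.append(('Content-Type: %s' % mime).encode('ascii'))
        parts.append(b'')
        parts.append(data)  # raw bytes, inserted unchanged
    parts.append(delimiter + b'--')  # closing delimiter
    parts.append(b'')  # so the body ends with CRLF
    body = b'\r\n'.join(parts)
    return 'multipart/form-data; boundary=' + boundary, body
```

The returned content type goes into the Content-type header and the body is passed as the request body, exactly as contentType and body are used in search_img above.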