In this tutorial, I'll explain thoroughly how you can use sockets to connect to websites and retrieve information. Even if you have no experience with sockets, you should learn a bit from reading this. I'll try to explain everything as simply as possible.
Contents
- Introduction
- Opening the socket
- on SOCKOPEN event
- Reading the information
- Processing the information
- Socket errors
- Custom identifiers and aliases
- Inputting information, then getting information
- Examples
- Conclusion/Credits
Introduction
You've probably seen sockets being used before: in other add-ons, in the help file, or maybe you've even tried to use them yourself. Whatever the case, sockets are definitely a handy part of mIRC scripting. While sockets can be used for a variety of things, in this tutorial I'm going to focus on one specific task: connecting to a web page and retrieving information from it. So, here we go... stay with me.
Opening the socket
In order to start the process of connecting to a website with a socket, the socket has to be opened. This is done with the /sockopen command, which has a fairly simple syntax.
/sockopen <name> <address> <port>
The name here is the socket name, which we'll refer to in the events that follow. I'd recommend choosing something you can easily remember, and preferably something relevant to the function of the socket.
The address is the website you want to open the socket to, and it'll be the website you'll eventually get the information from. This is simply the domain; don't worry about the folder or anything after the domain. For example, if the full address to the page I wanted information from was http://www.mircscripts.org/about.php, we'd only include mircscripts.org in the address part of this command. The rest of the URL will be stated later.
Finally, the port is the port you're connecting to. Don't worry about this for now; it's usually 80, the standard HTTP port, when accessing a web page.
Now that we've got the command down, let's start something off. Let's say I wanted to get the forum stats from mIRCscripts.org's forum page. This will be the task that I'll provide examples for as we move along. Keep in mind, the forum stats are printed near the top of the forum page. Our command would be:
/sockopen msforum mircscripts.org 80
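/sockopen reports an error if a socket with the same name is still open, for example when the command is run twice in quick succession. A minimal sketch of a wrapper alias that closes any leftover socket first; the alias name msstats is hypothetical and not part of the original example:

```mirc
alias msstats {
  ; $sock(msforum) returns the socket name if that socket is currently open.
  if ($sock(msforum)) sockclose msforum
  sockopen msforum mircscripts.org 80
}
```

With this in your remote script, typing /msstats starts the request instead of typing the /sockopen command by hand.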
Now let's move on.
The on SOCKOPEN event
Once we've opened a socket with /sockopen to a specific domain and port, the socket will connect to the domain and trigger the on SOCKOPEN event. In the event, we should tell the socket to go to a particular page on the website. We can do this by using another socket command, /sockwrite, to send a message to the socket. In this case, we should /sockwrite "GET", the page to access, and HTTP/1.1 or HTTP/1.0. Don't worry about the HTTP stuff for now, just use HTTP/1.1. Here's what we have:
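on *:sockopen:msforum:{
  sockwrite -n $sockname GET /forumlist.php HTTP/1.1
  ; This is telling the socket to connect to the specific page, using "GET".
  sockwrite -n $sockname Host: mircscripts.org $+ $crlf $+ $crlf
  ; This states the host once again.
}
Note: If the page you want is the domain's main page (for example, ms.org's main page), simply put a slash after GET, with no folder or page after it: sockwrite -n $sockname GET / HTTP/1.1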
mIRC's help file syntax for /sockwrite is defined as: /sockwrite [-tnb] <name> [numbytes] <text|%var|&binvar>
Don't worry, though, because it's essentially (for this purpose) just /sockwrite -n <name> <text>
The name is the socket name you've already connected with /sockopen, and the text is the message/text you want to send to the socket. Therefore, in the example above, I could've just used msforum instead of $sockname, which returns the current socket in the event. Either one would work.
Once we've sent the socket to a specific page, we're ready to deal with the information on the page.
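With HTTP/1.1 the server may keep the connection open after it has sent the page, so the socket can sit idle until it times out. A minimal sketch of the same event with one extra header that asks the server to close the connection when it is done. The Connection: close line is an addition that the example above doesn't have:

```mirc
on *:sockopen:msforum:{
  sockwrite -n $sockname GET /forumlist.php HTTP/1.1
  sockwrite -n $sockname Host: mircscripts.org
  ; Added header: ask the server to close the connection after the response.
  sockwrite -n $sockname Connection: close
  ; A line containing only a CRLF ends the request headers.
  sockwrite -n $sockname $crlf
}
```

Here each header gets its own /sockwrite line, and the blank line that ends the request is sent separately rather than being appended to the Host line.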
Reading the information
So far, our socket msforum has requested the exact page you want your information from. Now, it's time to use the on SOCKREAD event to handle the information. Basically, this event is triggered as lines of text arrive from the web server, and each /sockread into a variable reads one line. You won't receive only the text you see in a browser, though. The HTTP response headers come first, followed by the whole page's HTML (or whatever language) source.
Handling the information sent from the page can be the most difficult in this process, but it's still not too hard. This is how our event should go:
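on *:sockread:msforum:{
  if ($sockerr) {
    echo -a Error.
    halt
  }
  else {
    var %temptext
    sockread %temptext
    ; handling here
  }
}
With each line of data sent from the website to the socket, the commands in the event are processed. The $sockerr identifier checks whether a socket error has occurred. If one has ( if ($sockerr) { ), the script lets the user know by echoing, then halts. The socket is automatically closed when there's an error. If there's no error, the /sockread command is used. This sets the data received from the web page to a variable of your choice.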
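Several lines can already be waiting in the socket's buffer when the event fires. A common pattern is to keep reading inside the event until nothing is left, using $sockbr (the number of bytes the last /sockread read). This is a sketch of that pattern, not the code the rest of this tutorial builds on:

```mirc
on *:sockread:msforum:{
  if ($sockerr) {
    echo -a Error.
    return
  }
  var %temptext
  sockread %temptext
  ; $sockbr is 0 once there is no complete line left to read.
  while ($sockbr) {
    ; Handling for one line goes here. Blank lines are skipped because echo needs text.
    if (%temptext != $null) echo -s %temptext
    sockread %temptext
  }
}
```

The echo -s line is only there so you can see what arrives; replace it with your own checks once the script works.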
Now, if statements will come into play. You'll need them to compare the HTML/code received from the web page with the HTML that holds the data you want. A handy way to find that HTML is to open the specific page in your browser, in this case the forum page, and view its source (right click, view source). Now, look for the text or data we want in the source. The find feature in Notepad is useful here.
In this example, there are two lines that have the information we want in them. Here they are, as they'll appear when being set with /sockread.
<span>Stats: 13064 posts / 1994 threads / 17322 users</span><br>
<span>Last 24 hours: 83 posts / 34 new users</span><br>
Now that we've got the text we want, our goal is to turn this into an if statement that checks whether the line the on SOCKREAD event is currently reading is indeed the line we want. A good idea is to replace the changing parts of the line with a * character, and then use that string in an if statement with the iswm operator. For example:
if (<span>Stats: * posts / * threads / * users</span><br> iswm %temptext)
Remember, iswm is a wildcard operator, so this method of replacing changing parts with * is a good idea. And don't forget that %temptext is our current line from the web page that's being read from. You don't need to use this method; you can deal with this any way you want. Just remember, the goal of this line is to get rid of the HTML lines that don't have the information you're looking for, and handle the ones that do. You can use a variety of if operators, tokens, or anything else that suits your needs.
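As one alternative to iswm, a regular expression can check the line and pull the numbers out in the same step. A minimal sketch, assuming a hypothetical test alias named statstest and a sample line copied from the page source above:

```mirc
alias statstest {
  ; Sample line copied from the page source shown earlier.
  var %line = <span>Stats: 13064 posts / 1994 threads / 17322 users</span><br>
  ; Each (\d+) group captures one number, and $regml(N) returns the Nth capture.
  if ($regex(%line,/Stats: (\d+) posts \/ (\d+) threads \/ (\d+) users/)) {
    echo -a Posts: $regml(1) Threads: $regml(2) Users: $regml(3)
  }
  else echo -a The line did not match.
}
```

Running /statstest with this sample line should echo Posts: 13064 Threads: 1994 Users: 17322. Inside on SOCKREAD you would test %temptext instead of the sample variable.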
Here is our final event code:
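on *:sockread:msforum:{
  if ($sockerr) {
    echo -a Error.
    halt
  }
  else {
    var %temptext
    sockread %temptext
    if (<span>Stats: * posts / * threads / * users</span><br> iswm %temptext) || (<span>Last 24 hours: * posts / * new users</span><br> iswm %temptext) {
      echo -a -
      echo -a $htmlfree(%temptext)
    }
  }
}
The echo -a $htmlfree(%temptext) line is the actual command for what to do with the text. The $htmlfree identifier isn't a default mIRC identifier; we'll make it a custom identifier that strips HTML code, leaving only the text. Using custom identifiers this way is covered later in the tutorial. To conclude this event, we echo the stats to the active window. Processing the information is covered next.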
Processing the information
By processing information, I'm talking about whatever you want to do with the isolated information you now have. This is up to you and your scripting style, whether it be applying it to an editbox inside a dialog, echoing it to a custom window, displaying it in a picture window, or simply (as in the above example), echoing it to the active window.
This is a very simple step, as your text lies in $htmlfree(%temptext) in the above example. Your variable name and the name of your HTML-stripping identifier might differ. So, if you're applying the text to a dialog ID, you could do /did -a dialogname id $htmlfree(%temptext). Once again, this is completely up to you.
Here's a table summarizing what you can do with the information, using the variable and identifier we used in our example. Messaging to all channels isn't recommended (same with messaging to the active channel or query), because if you're dealing with more than two or three lines of data it can cause a flood.
Where? | Command
Dialog | /did -a dialogname id $htmlfree(%temptext)
Status window | /echo -s $htmlfree(%temptext)
Active window | /echo -a $htmlfree(%temptext)
Custom window | /aline @window $htmlfree(%temptext)
All channels | /amsg $htmlfree(%temptext)
Active channel/query | /msg $active $htmlfree(%temptext)
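For the custom window row, the window has to exist before /aline can add to it. A small sketch of a helper that takes care of that; the alias name showstats and the window name @msstats are made up for illustration:

```mirc
alias -l showstats {
  ; Create the custom window only if it does not exist yet.
  if (!$window(@msstats)) window @msstats
  ; Add the text that was passed to the alias as a new line.
  aline @msstats $1-
}
```

Inside the on SOCKREAD event you could then call showstats $htmlfree(%temptext) instead of echoing. The alias is local (-l), so it has to live in the same script file as the event.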
Socket errors
As with any part of mIRC scripting, there will frequently be mistakes in code. With sockets, it's a bit easier to catch the error than with other parts of scripting. There are some identifiers and methods that can help you debug, such as:
$sock($sockname).wsmsg - This will return an error message, if applicable.
$sockerr - In the example above, I used this to check if there was an error at all. If so, halt.
If you're still having trouble, adding echo -s %var in your code can sometimes help you find what mIRC is actually reading from the web page. %var would be the variable you're /sockread'ing to. Make sure the line is after you've actually used /sockread to the variable. You can also resort to trying HTTP/1.0 in your on SOCKOPEN event.
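To avoid repeating the same error handling in every event, the two identifiers above can be combined in one small helper. This is only a sketch, and the alias name sockfail is hypothetical:

```mirc
alias -l sockfail {
  ; $1 is the socket name handed over by the calling event.
  echo -s Socket $1 failed: $sock($1).wsmsg
  ; Close the socket if it is still open so the name can be reused.
  if ($sock($1)) sockclose $1
}
```

In an event you would then write if ($sockerr) { sockfail $sockname | return } as the first line, which shows the error text in the status window rather than a bare "Error."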
Custom identifiers and aliases
Including custom identifiers and aliases in your code can be a big help. It can usually shorten code, or save extra work from being done. I mentioned the $htmlfree identifier above, which would strip the HTML tags from code, leaving only the text behind. Obviously, that helps a lot when dealing with sockets and handling HTML information.
fubar created a snippet that strips HTML using tokens and a loop. Or, you can use regular expressions. Here is the $htmlfree identifier I used in the examples above:
alias -l htmlfree {
  ; It's local because it won't be used by the command line, only this file.
  ; Local aliases avoid conflicting names.
  var %x, %i = $regsub($1-,/(^[^<]*>|<[^>]*>|<[^>]*$)/g,$null,%x), %x = $remove(%x,&nbsp;)
  return %x
}
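A quick way to check that the identifier does what you expect is to feed it one of the sample lines. A minimal sketch, assuming a hypothetical test alias named htmltest placed in the same script file (the identifier is local, so it can't be called from the command line directly):

```mirc
alias htmltest {
  ; Sample line copied from the forum page source.
  var %line = <span>Last 24 hours: 83 posts / 34 new users</span><br>
  echo -a $htmlfree(%line)
}
```

Typing /htmltest should echo the line with the span and br tags removed, leaving Last 24 hours: 83 posts / 34 new users.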
Inputting information, then getting information
Note: If you don't plan on inputting information in a form and retrieving output information, skip this step, as it may only confuse you.
Some pages require you to input certain information before they give you the information you want. A search engine is a good example: you type what you want to search for in an editbox, click "Search", and the search results are given to you.
You can "input" information with sockets by using a different URL in the GET request in the on SOCKOPEN event. Most forms you fill out, like the search editbox I mentioned above, put the information you entered in the target URL. The usual format is http://website.com/page.php?informationname=data&moreinfo=moredata&moreinfo=moredata.
This isn't as complex as it may look. Here's an example. Let's say I search ms.org's script archives for the text "searchword". By trying this yourself, you can see the outcome URL in your browser's address box. This is what it should show:
http://www.mircscripts.org/archive.php?stype=all&squery=searchword&sorder=file_date&ssort=desc&perpage=50
The information it holds is mostly self-explanatory, as it usually is. The "stype" field holds what kind of script you're looking for (the choices are scripts, add-ons, etc.). "squery" holds the actual search string/phrase. And "sorder", "ssort", and "perpage" hold how to sort the results, whether the order is ascending or descending, and how many to display per page, respectively.
To get this kind of URL for whatever web page you input the data on, simply navigate there with your browser and input some kind of data. Take the URL from your browser's address bar and replace the variable data (the data that would be different in different searches) with an mIRC variable, and you can use that URL in your GET request in the on SOCKOPEN event. For example...
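on *:sockopen:hello:{
  sockwrite -n $sockname GET /info.php?user= $+ %user $+ &info= $+ %info HTTP/1.1
  ; Or this could be: $+(/info.php?user=,%user,&info=,%info) using $+()
  sockwrite -n $sockname Host: mircscripts.org $+ $crlf $+ $crlf
}
This way, all you need to do is have the script set the variables %user and %info, and your input will work properly. You can do this by altering the initiation alias.
alias hello {
  if ($2-) {
    set %user $1
    set %info $2
    ; Variables to be used in the future URL are being set here!
    ; eifjdfijdojf.com is a placeholder domain. Use the same site here as in the Host: line.
    sockopen hello eifjdfijdojf.com 80
  }
  else {
    echo -a You didn't specify a user and the info.
  }
}
Now that the variables are all set, the URL will work properly once the socket is opened. The page that loads with this URL will be the page with all your "custom" data on it, or basically, the data that you'd normally get if you went to the page with your browser, entered the information, and clicked the button. You can now handle the data normally.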
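Whatever the user types ends up inside the request line, and characters such as spaces or & would break the URL unless they are URL-encoded first. A cautious sketch that sidesteps encoding by accepting only letters and digits; the alias name hello2 is hypothetical, and the domain is the same placeholder used above:

```mirc
alias hello2 {
  ; Accept only letters and digits, so nothing needs to be URL-encoded.
  if (!$regex($1,/^[A-Za-z0-9]+$/)) || (!$regex($2,/^[A-Za-z0-9]+$/)) {
    echo -a Usage: /hello2 <user> <info> (letters and digits only)
    return
  }
  set %user $1
  set %info $2
  ; Placeholder domain, as in the alias above.
  sockopen hello eifjdfijdojf.com 80
}
```

This is stricter than a real form would be. If your page needs spaces or punctuation in its values, you would need an encoding step instead of this check.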
Note: Don't forget to unset the variables when you're finished getting the data. This is usually at the same time when you close your socket with /sockclose.
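One way to keep that cleanup in a single place is a small alias that both unsets the variables and closes the socket. This is a sketch, and the alias name hellodone is made up:

```mirc
alias -l hellodone {
  ; Remove the variables that were set by the initiation alias.
  unset %user %info
  ; Close the socket if it is still open.
  if ($sock(hello)) sockclose hello
}

; Also clean up when the server closes the connection on its own.
on *:sockclose:hello:hellodone
```

Call hellodone from your on SOCKREAD event once you have the line you wanted. The on SOCKCLOSE event only fires when the remote side closes the connection, not when your own script uses /sockclose.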
Examples
Let's say we want to connect to ms.org's forum page and retrieve the posts in the last 24 hours, and then make some calculations and display them.
alias msactive sockopen forum mircscripts.org 80

on *:sockread:forum:{
  if ($sockerr) {
    echo -a Error.
    halt
  }
  else {
    var %'
    sockread %'
    if (<span>Last 24 hours: * posts / * new users</span><br> iswm %') {
      var %stripped = $nohtml(%'), %24 = $gettok(%stripped,4,32), %new = $gettok(%stripped,7,32)
      ; Right now we have %24 as the posts in the last 24 hours
      ; And %new as how many new users in the last 24 hours
      window @hi
      aline @hi %'
      aline @hi Average of $round($calc(%24 /24),1) posts per hour
      aline @hi Average of $round($calc(%new /24),2) new users every hour
      sockclose $sockname
      halt
    }
  }
}

alias -l nohtml {
  var %x, %i = $regsub($1-,/(^[^<]*>|<[^>]*>|<[^>]*$)/g,$null,%x), %x = $remove(%x,&nbsp;)
  return %x
}

on *:sockopen:forum:{
  sockwrite -n $sockname GET /forumlist.php HTTP/1.1
  sockwrite -n $sockname Host: mIRCscripts.org $+ $crlf $+ $crlf
}
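To see where the token numbers and averages in the example come from, the calculation can be run on its own with the sample line from earlier. A minimal sketch, assuming a hypothetical test alias named avgtest:

```mirc
alias avgtest {
  ; The sample line as it looks once the HTML tags have been stripped.
  var %stripped = Last 24 hours: 83 posts / 34 new users
  ; Counting space-separated tokens, 4 is the post count and 7 is the new-user count.
  var %24 = $gettok(%stripped,4,32), %new = $gettok(%stripped,7,32)
  echo -a $round($calc(%24 / 24),1) posts per hour
  echo -a $round($calc(%new / 24),2) new users per hour
}
```

With these numbers, 83 / 24 is about 3.458 and 34 / 24 is about 1.417, so /avgtest should echo 3.5 posts per hour and 1.42 new users per hour.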
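Many more examples for you to learn from can be found at the snippet page, where a lot of the snippets use sockets to get information. You can also find some socket snippets at my mIRC projects page, located on my website.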