Data-mining / Security

March 24, 2009 at 10:39:57
Specs: Windows XP
I am trying to tackle a networking project and need a little advice.

I am using a program that receives a continuous stream of data via the internet for several hours a day and so requires an active internet connection during those hours. The program displays the information on a graph as the data arrives. The same program allows me to statistically analyse the data. My main concern is that I cannot be sure that the program isn’t surreptitiously sharing my statistical analysis with the developers of the program or the third party that provides the data. In other words, the program could be engineered to take advantage of its constant access to the internet and participate in some sort of data-mining without my knowledge.

The obvious solution is a firewall, but the difficulty is that because the program has to have both inbound and outbound access to the internet, I would have to know exactly what information to block and what information to allow.

After some research on this, I found out about packet sniffing and used Wireshark to look at the data that the program sends and receives. The parts I can understand seem innocent enough, such as my username and password required to log into the data service. But there seems to be some traffic that is somehow encoded.
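One rough way to tell whether the unreadable traffic is encrypted or compressed, rather than just in an unfamiliar format, is to measure its byte entropy. The sketch below is in Python and assumes the payload bytes have already been exported from Wireshark; the function name and the sample payloads are made up for illustration. Values close to 8 bits per byte suggest encryption or compression, while plain text usually sits well below that. It says nothing about what the bytes actually contain.

```python
import math
from collections import Counter


def shannon_entropy(payload: bytes) -> float:
    """Return the Shannon entropy of payload in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    # Sum of p * log2(1/p) over every byte value that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical samples: a plaintext login line and evenly spread bytes.
    plaintext = b"USER alice PASS secret"
    spread = bytes(range(256))
    print(f"plaintext: {shannon_entropy(plaintext):.2f} bits/byte")
    print(f"spread:    {shannon_entropy(spread):.2f} bits/byte")
```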

I have come to the conclusion that perhaps the best way to ensure absolute security is to use a network. Here is my layman’s explanation of what I hope to achieve.

Instead of using just one computer, a network of two computers will be used.

The first computer will have a copy of the program on it and will have complete access to the internet.

The data that is received via the internet in the first computer will somehow be forwarded onto a second computer.

The second computer will also have a copy of the program on it but will have no connection to the internet and will only receive the data forwarded to it by the first computer.

What would be the simplest way to proceed with implementing such a two-computer strategy? Or, is there a single-computer strategy that would ensure an equal level of security?
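A minimal sketch of the forwarding step, assuming the first computer can get the price data out of the program as lines of text and that the second computer can read it in that form (neither is confirmed for this program). It is in Python and uses UDP because the sender needs no reply from the receiver; the addresses, port, function names, and sample ticks are all placeholders. On a real LAN, UDP can drop or reorder datagrams.

```python
import socket


def forward_lines(lines, host, port):
    """Send each text line to the analysis computer as one UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        for line in lines:
            sender.sendto(line.encode("utf-8"), (host, port))


def receive_lines(listener, count, timeout=2.0):
    """Read up to count datagrams from an already bound UDP socket.

    The receiver never sends anything back.
    """
    listener.settimeout(timeout)
    received = []
    try:
        for _ in range(count):
            data, _addr = listener.recvfrom(4096)
            received.append(data.decode("utf-8"))
    except socket.timeout:
        pass  # return whatever arrived before the timeout
    return received


if __name__ == "__main__":
    # Loopback demo with made-up ticks; on a real LAN the listener would bind
    # on the second computer and the sender would target that computer.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
        port = listener.getsockname()[1]
        forward_lines(["SYM,10:39:57,2.15", "SYM,10:39:58,2.16"],
                      "127.0.0.1", port)
        print(receive_lines(listener, 2))
```

The second computer would still need its own firewall rule blocking everything outbound; the script only shows the data path, not the isolation.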




#1
March 24, 2009 at 12:09:33
I think what you really need is an encrypted VPN tunnel between your remote host and the PC you use to view the data on. This way nobody could monitor the data being transferred as it would be inside an encrypted VPN tunnel.

"The parts I can understand seem innocent enough, such as my username and password required to log into the data service"

Innocent enough!?!? Dude, if someone can do a packet capture on your traffic and get the username/password you use to log in to the data service, it follows that they could then log in to the same data service (as you) and do whatever they pleased to the computer(s) at the other end.

Do some googling on VPN tunnels and see if that isn't more what you need. It's worth noting that a lot of SOHO routers are capable of creating a VPN tunnel. Since SOHO routers are relatively inexpensive, this could be a viable solution for you.



#2
March 24, 2009 at 13:00:13
Hello,

Thanks for the reply. Perhaps I need to clarify a few matters.

I guess innocent wasn’t the best choice of words :) . I was just trying to illustrate the fact that some information is being sent in plaintext and some is being encrypted. My concern is that amongst the encrypted information could be information that I don’t want to share.

I was hoping that by using a network of two computers I could have one computer get all the information via the internet and then forward it to the second computer which wouldn’t have any connection to the internet. That way, I could use the internet-connected computer to retrieve the data, have that data forwarded to the second computer, and do all of my statistical analysis on the second computer. The second computer would have some form of firewall or other traffic management system to block all communication except that which was forwarded from the first computer.

I am no expert, but I don’t think that a VPN in and of itself would be the solution as the program itself can share data. So it’s not so much a matter of people seeing the data, but the fact that the data is being sent by the program. The program has to somehow be stopped from sharing data, thus the idea of the network with two computers.

Hopefully this provides a better picture of my circumstance!



#3
March 24, 2009 at 14:07:35
"I was hoping that by using a network of two computers I could have one computer get all the information via the internet and then forward it to the second computer which wouldn’t have any connection to the internet. That way, I could use the internet-connected computer to retrieve the data, have that data forwarded to the second computer, and do all of my statistical analysis on the second computer."

The problem as I see it is, even with your scenario above, you still have the data travelling across the internet in an unencrypted format. Segmenting the PC you use to perform statistical analysis only protects it. The data, and computer at the other end, are still unprotected. An encrypted VPN tunnel protects both ends of the link as well as the data traversing it. This is why I recommended that.

"I am no expert, but I don’t think that a VPN in and of itself would be the solution as the program itself can share data. So it’s not so much a matter of people seeing the data, but the fact that the data is being sent by the program. The program has to somehow be stopped from sharing data, thus the idea of the network with two computers."

Sure the program itself can share data....if it didn't, you couldn't use it to provide your analysis computer with data to analyze could you!? LOL

But, like most any software that does allow sharing, I suspect you (or someone else in control) have to decide what to share and with whom.

If you're concerned about the 'server' sharing data with just anybody, then you have to secure that computer. Or get whoever is in charge of it to do so. As I said, in most every case I've ever seen, someone has to decide what is shared with whom. Set it up to share only with you and that concern is taken care of.



#4
March 24, 2009 at 16:44:32
Hello again,

I certainly appreciate the time you have spent trying to help me out! However, I seem to have made things more complicated than I intended by not specifically describing my circumstance.

I trade stocks online and use a specialized computer program (I will call it Program-X) to display stock charts. The program can be combined with a third party datafeed provider to receive real-time stock price data. For instance, if I load a chart to show the current price of General Motors, I can see the price of General Motors in real time as it is traded on the New York Stock Exchange. Program-X also allows me to develop customized mathematical studies that help me identify when to buy and sell.

As you can imagine, to do this, Program-X requires both inbound and outbound internet traffic to my computer so that it can communicate with the datafeed provider and thereby display price data on any particular stock.

While this in theory is not a security risk, my concern is that Program-X could take advantage of the inbound/outbound stream to send additional data (beyond that which is necessary to display a chart) to the company that makes Program-X or indeed the datafeed provider. For instance, data such as the formula for one of my customized mathematical studies could also be sent in that inbound/outbound stream.

As I said in my earlier post, the obvious solution is a firewall, but the difficulty is that because Program-X has to have both inbound and outbound access to the internet, I would have to know exactly what information packets to block and what information packets to allow. This is doubly difficult because some of the packets appear to be encrypted.
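Even when the payload is encrypted, the destination and size of each packet are still visible, so one thing that can be checked without decoding anything is how much data leaves the machine and where it goes. A rough Python sketch, assuming the capture has been exported as (source, destination, length) records; the addresses below are documentation placeholders and the function name is made up.

```python
from collections import defaultdict


def tally_by_remote(packets, local_ip):
    """Sum bytes sent to and received from each remote address.

    packets: iterable of (src_ip, dst_ip, length_in_bytes) tuples.
    Returns {remote_ip: {"sent": n, "received": m}}.
    """
    totals = defaultdict(lambda: {"sent": 0, "received": 0})
    for src, dst, length in packets:
        if src == local_ip:
            totals[dst]["sent"] += length
        elif dst == local_ip:
            totals[src]["received"] += length
        # packets that do not involve local_ip are ignored
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical capture: one datafeed server and one unexpected host.
    local = "192.168.1.5"
    capture = [
        (local, "203.0.113.10", 120),
        ("203.0.113.10", local, 1400),
        ("203.0.113.10", local, 1400),
        (local, "198.51.100.7", 900),
    ]
    for remote, totals in sorted(tally_by_remote(capture, local).items()):
        print(remote, totals)
```

A remote address that receives far more than a chart request should need, or that is not the datafeed at all, is the one worth a closer look.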

I was hoping that there might be a way to solve this problem by setting up a network of two computers. One computer would have a copy of Program-X on it and could request/receive data. However, I would not input any of my customized mathematical studies on it; the program would simply be a means of retrieving price data. This price data would then be forwarded to a second computer. The second computer would also have a copy of Program-X, but would have no way of sending outbound data and could only receive inbound data.

I can see why you suggested the VPN approach, but unfortunately, there is no way that I can establish a VPN connection directly with the datafeed provider. Even if it were possible, the VPN would only serve to protect against interception of the stream and wouldn’t preclude Program-X sending proprietary data—it would only encrypt the data being sent.



#5
March 24, 2009 at 17:34:55
Curt R is trying to explain to you that your solution is only protecting your data on your LAN at best, and if you have a wireless router, it might not even be doing that.

If you think that program X is phoning home and don't understand Wireshark's capture output, run netstat -an to see if there are any unwanted connections to program X's site or anywhere else.
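For anyone who would rather filter the netstat output than read it by eye, a small Python sketch follows. It assumes the text comes from `netstat -ano` on Windows (the -o switch adds the owning process ID) and that the column layout matches the sample in the code; the sample lines, addresses, and PID are invented, and the function names are just for illustration.

```python
def parse_netstat(text):
    """Extract TCP rows from `netstat -ano` style output.

    Returns a list of dicts with local, remote, state and pid keys.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # TCP rows have five columns: Proto, Local, Foreign, State, PID
        if len(fields) == 5 and fields[0] == "TCP":
            _proto, local, remote, state, pid = fields
            rows.append({"local": local, "remote": remote,
                         "state": state, "pid": int(pid)})
    return rows


def established_for_pid(rows, pid):
    """Return remote endpoints of ESTABLISHED connections owned by pid."""
    return [r["remote"] for r in rows
            if r["pid"] == pid and r["state"] == "ESTABLISHED"]


# Invented sample in the assumed column layout.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       952
  TCP    192.168.1.5:1042       203.0.113.10:443       ESTABLISHED     1234
  TCP    192.168.1.5:1043       198.51.100.7:80        ESTABLISHED     1234
  UDP    0.0.0.0:445            *:*                                    4
"""

if __name__ == "__main__":
    print(established_for_pid(parse_netstat(SAMPLE), 1234))
```

The PID to pass in is whatever Task Manager shows for program X while it is running.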

May I ask what the big secret is about program X? If you provide the name, someone might be able to help you establish if the developers are ethical or not.

I've done many mathematical studies on stocks as well, and they all came to the same conclusion: buy low and sell high.

